{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T10:47:04Z","timestamp":1776077224308,"version":"3.50.1"},"reference-count":46,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2022,10,23]],"date-time":"2022-10-23T00:00:00Z","timestamp":1666483200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>One of the essential layers in most Convolutional Neural Networks (CNNs) is the pooling layer, which is placed right after the convolution layer, effectively downsampling the input and reducing the computational power required. Different pooling methods have been proposed over the years, each with its own advantages and disadvantages, rendering them a better fit for different applications. We introduce a benchmark between many of these methods that highlights an optimal choice for different scenarios depending on each project\u2019s individual needs, whether it is detail retention, performance, or overall computational speed requirements.<\/jats:p>","DOI":"10.3390\/a15110391","type":"journal-article","created":{"date-parts":[[2022,10,23]],"date-time":"2022-10-23T20:43:50Z","timestamp":1666557830000},"page":"391","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":28,"title":["Convolutional Neural Networks: A Roundup and Benchmark of Their Pooling Layer Variants"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9528-4349","authenticated-orcid":false,"given":"Nikolaos-Ioannis","family":"Galanis","sequence":"first","affiliation":[{"name":"MLV Research Group, Department of Computer Science, International Hellenic University, 65404 Kavala, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6448-5524","authenticated-orcid":false,"given":"Panagiotis","family":"Vafiadis","sequence":"additional","affiliation":[{"name":"MLV Research Group, Department of Computer Science, International Hellenic University, 65404 Kavala, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4473-4631","authenticated-orcid":false,"given":"Kostas-Gkouram","family":"Mirzaev","sequence":"additional","affiliation":[{"name":"MLV Research Group, Department of Computer Science, International Hellenic University, 65404 Kavala, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5545-1499","authenticated-orcid":false,"given":"George A.","family":"Papakostas","sequence":"additional","affiliation":[{"name":"MLV Research Group, Department of Computer Science, International Hellenic University, 65404 Kavala, Greece"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,23]]},"reference":[{"key":"ref_1","unstructured":"Forsyth, D.A., and Ponce, J. (2002). Computer Vision: A Modern Approach, Prentice Hall."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1113\/jphysiol.2006.118976","article-title":"What simple and complex cells compute","volume":"577","author":"Carandini","year":"2006","journal-title":"J. Physiol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1113\/jphysiol.1978.sp012488","article-title":"Spatial summation in the receptive fields of simple cells in the cat\u2019s striate cortex","volume":"283","author":"Movshon","year":"1978","journal-title":"J. Physiol."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Fukushima, K., and Miyake, S. (1982). Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition. Competition and Cooperation in Neural Nets, Springer.","DOI":"10.1007\/978-3-642-46466-9_18"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lin, Y., Lv, F., Zhu, S., Yang, M., Cour, T., Yu, K., Cao, L., and Huang, T. (2011, January 20\u201325). Large-scale image classification: Fast feature extraction and svm training. Proceedings of the CVPR 2011, Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995477"},{"key":"ref_6","unstructured":"Zhang, H., Berg, A.C., Maire, M., and Malik, J. (2006, January 17\u201322). SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201906), New York, NY, USA."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"879","DOI":"10.1007\/s00521-019-04296-5","article-title":"Interpretation of intelligence in CNN-pooling processes: A methodological survey","volume":"32","author":"Akhtar","year":"2020","journal-title":"Neural Comput. Appl."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"303","DOI":"10.2478\/fcds-2019-0016","article-title":"Implications of pooling strategies in convolutional neural networks: A deep insight","volume":"44","author":"Sharma","year":"2019","journal-title":"Found. Comput. Decis. Sci."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"5455","DOI":"10.1007\/s10462-020-09825-6","article-title":"A survey of the recent architectures of deep convolutional neural networks","volume":"53","author":"Khan","year":"2020","journal-title":"Artif. Intell. Rev."},{"key":"ref_10","unstructured":"Gholamalinezhad, H., and Khosravi, H. (2020). Pooling Methods in Deep Neural Networks, a Review. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"5321","DOI":"10.1007\/s00521-022-06953-8","article-title":"Pooling in convolutional neural networks for medical image analysis: A survey and an empirical study","volume":"34","author":"Nirthika","year":"2022","journal-title":"Neural Comput. Appl."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Yamaguchi, K., Sakamoto, K., Akabane, T., and Fujimoto, Y. (1990, January 18\u201322). A neural network for speaker-independent isolated word recognition. Proceedings of the First International Conference on Spoken Language Processing, Kobe, Japan.","DOI":"10.21437\/ICSLP.1990-282"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Murray, N., and Perronnin, F. (2014, January 23\u201328). Generalized Max pooling. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.317"},{"key":"ref_14","unstructured":"Thoma, M. (2022, September 08). LaTeX Examples. Available online: https:\/\/github.com\/MartinThoma\/LaTeX-examples."},{"key":"ref_15","unstructured":"Graham, B. (2014). Fractional Max-pooling. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"2339","DOI":"10.1109\/LSP.2015.2480802","article-title":"Deeppano: Deep panoramic representation for 3-d shape recognition","volume":"22","author":"Shi","year":"2015","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"960","DOI":"10.1016\/j.dsp.2013.01.004","article-title":"Dictionary learning based sparse coefficients for audio classification with Max and Average pooling","volume":"23","author":"Zubair","year":"2013","journal-title":"Digit. Signal Process."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.neunet.2016.07.003","article-title":"Rank-based pooling for deep convolutional neural networks","volume":"83","author":"Shi","year":"2016","journal-title":"Neural Netw."},{"key":"ref_19","unstructured":"Lee, C.Y., Gallagher, P.W., and Tu, Z. (2016, January 9\u201311). Generalizing pooling functions in convolutional neural networks: Mixed, gated, and tree. Proceedings of the Artificial Intelligence and Statistics, Cadiz, Spain."},{"key":"ref_20","unstructured":"Sermanet, P., Chintala, S., and LeCun, Y. (2012, January 11\u201315). Convolutional neural networks applied to house numbers digit classification. Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), Tsukuba, Japan."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"9371","DOI":"10.1007\/s10586-018-2165-4","article-title":"Weighted pooling for image recognition of deep convolutional neural networks","volume":"22","author":"Zhu","year":"2019","journal-title":"Clust. Comput."},{"key":"ref_22","unstructured":"Zeiler, M.D., and Fergus, R. (2013). Stochastic pooling for regularization of deep convolutional neural networks. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/s13042-010-0001-0","article-title":"Understanding bag-of-words model: A statistical framework","volume":"1","author":"Zhang","year":"2010","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_25","unstructured":"(2021, May 14). ResearchGate. Available online: https:\/\/tinyurl.com\/researchgateSPPfigure."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1788","DOI":"10.1109\/LSP.2016.2637355","article-title":"Look wider to match image patches with convolutional neural networks","volume":"24","author":"Park","year":"2016","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"3481","DOI":"10.1109\/TFUZZ.2020.3024023","article-title":"Fuzzy Pooling","volume":"29","author":"Diamantis","year":"2020","journal-title":"IEEE Trans. Fuzzy Syst."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_29","unstructured":"Schuurmans, M., Berman, M., and Blaschko, M.B. (2018). Efficient semantic image segmentation with superpixel pooling. arXiv."},{"key":"ref_30","unstructured":"Rippel, O., Snoek, J., and Adams, R.P. (2015). Spectral representations for convolutional neural networks. arXiv."},{"key":"ref_31","unstructured":"Zhang, H., and Ma, J. (2018). Hartley Spectral Pooling for Deep Learning. arXiv."},{"key":"ref_32","unstructured":"Williams, T., and Li, R. (May, January 30). Wavelet pooling for convolutional neural networks. Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1310","DOI":"10.1109\/LSP.2016.2589962","article-title":"Deep CNNs Along the Time Axis with Intermap Pooling for Robustness to Spectral Variations","volume":"23","author":"Lee","year":"2016","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ayachi, R., Afif, M., Said, Y., and Atri, M. (2018, January 18\u201320). Strided convolution instead of Max pooling for memory efficiency of convolutional neural networks. Proceedings of the International Conference on the Sciences of Electronics, Technologies of Information and Telecommunications, Genoa, Italy and Hammammet, Tunisia.","DOI":"10.1007\/978-3-030-21005-2_23"},{"key":"ref_35","unstructured":"Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., and Tian, Q. (November, January 27). Centernet: Keypoint triplets for object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Law, H., and Deng, J. (2018, January 8\u201314). Cornernet: Detecting objects as paired keypoints. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_45"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"3349","DOI":"10.1109\/TPAMI.2020.2983686","article-title":"Deep high-resolution representation learning for visual recognition","volume":"43","author":"Wang","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201322). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_39","unstructured":"Gao, Z., Wang, L., and Wu, G. (November, January 27). Lip: Local importance-based pooling. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"115084","DOI":"10.1016\/j.eswa.2021.115084","article-title":"Universal pooling\u2014A new pooling method for convolutional neural networks","volume":"180","author":"Hyun","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Stergiou, A., Poppe, R., and Kalliatakis, G. (2021). Refining activation downsampling with SoftPool. arXiv.","DOI":"10.1109\/ICCV48922.2021.01019"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1023\/A:1014573219977","article-title":"A taxonomy and evaluation of dense two-frame stereo correspondence algorithms","volume":"47","author":"Scharstein","year":"2002","journal-title":"Int. J. Comput. Vis."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Vedaldi, A., and Lenc, K. (2015, January 26\u201330). Matconvnet: Convolutional neural networks for matlab. Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia.","DOI":"10.1145\/2733373.2807412"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"2032","DOI":"10.1364\/JOSAA.7.002032","article-title":"Contrast in complex images","volume":"7","author":"Peli","year":"1990","journal-title":"JOSA A"},{"key":"ref_45","unstructured":"Instruments, N. (2022, September 08). Peak Signal-To-Noise Ratio as an Image Quality Metric. Available online: https:\/\/www.ni.com\/en-us\/innovations\/white-papers\/11\/peak-signal-to-noise-ratio-as-an-image-quality-metric.html."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. Image Process."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/15\/11\/391\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:01:11Z","timestamp":1760144471000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/15\/11\/391"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,23]]},"references-count":46,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["a15110391"],"URL":"https:\/\/doi.org\/10.3390\/a15110391","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,23]]}}}