{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T19:16:12Z","timestamp":1777576572796,"version":"3.51.4"},"reference-count":30,"publisher":"Walter de Gruyter GmbH","issue":"2","license":[{"start":{"date-parts":[[2025,2,5]],"date-time":"2025-02-05T00:00:00Z","timestamp":1738713600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Almost all computer vision tasks rely on convolutional neural networks and transformers, both of which require extensive computations. With the increasingly large size of images, it becomes challenging to input these images directly. Therefore, in typical cases, we downsample the images to a reasonable size before proceeding with subsequent tasks. However, the downsampling process inevitably discards some fine-grained information, leading to network performance degradation. Existing methods, such as strided convolution and various pooling techniques, struggle to address this issue effectively. To overcome this limitation, we propose a generalized downsampling module, Adaptive Separation Fusion Downsampling (ASFD). ASFD adaptively captures intra- and inter-region attentional relationships and preserves feature representations lost during downsampling through fusion. We validate ASFD on representative computer vision tasks, including object detection and image classification. Specifically, we incorporate ASFD into the YOLOv7 object detection model and several classification models. Experiments demonstrate that the modified YOLOv7 architecture surpasses state-of-the-art models in object detection, particularly excelling in small object detection. Additionally, our method outperforms commonly used downsampling techniques in classification tasks. Furthermore, ASFD functions as a plug-and-play module compatible with various network architectures.<\/jats:p>","DOI":"10.2478\/jaiscr-2025-0010","type":"journal-article","created":{"date-parts":[[2025,2,5]],"date-time":"2025-02-05T18:38:28Z","timestamp":1738780708000},"page":"197-210","source":"Crossref","is-referenced-by-count":1,"title":["Adaptive Separation Fusion: A Novel Downsampling Approach in CNNS"],"prefix":"10.2478","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2820-0405","authenticated-orcid":false,"given":"Xia","family":"Ji","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, Anhui University"},{"name":"Anhui Provincial International Joint Research Center for Advanced Technology in Medical Imaging"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-5363-4271","authenticated-orcid":false,"given":"Jinglong","family":"Chang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Anhui University"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-7287-1302","authenticated-orcid":false,"given":"Yapeng","family":"Ji","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Anhui University"}]}],"member":"374","published-online":{"date-parts":[[2025,2,5]]},"reference":[{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_001","unstructured":"K. Simonyan and A. Zisserman, \u201cVery deep convolutional networks for large-scale image recognition,\u201d 2014."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_002","doi-asserted-by":"crossref","unstructured":"K. He, X. Zhang, S. Ren, and J. Sun, \u201cDeep residual learning for image recognition,\u201d in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770\u2013778, 2016.","DOI":"10.1109\/CVPR.2016.90"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_003","doi-asserted-by":"crossref","unstructured":"R. Girshick, J. Donahue, T. Darrell, and J. Malik, \u201cRich feature hierarchies for accurate object detection and semantic segmentation,\u201d in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 580\u2013587, 2014.","DOI":"10.1109\/CVPR.2014.81"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_004","doi-asserted-by":"crossref","unstructured":"O. Ronneberger, P. Fischer, and T. Brox, \u201cU-net: Convolutional networks for biomedical image segmentation,\u201d in Medical image computing and computer-assisted intervention\u2013MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, pp. 234\u2013241, Springer, 2015.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_005","doi-asserted-by":"crossref","unstructured":"W. Zhang, Z. Hong, L. Xiong, Z. Zeng, Z. Cai, and K. Tan, \u201cSinextnet: A new small object detection model for aerial images based on pp-yoloe,\u201d Journal of Artificial Intelligence and Soft Computing Research, vol. 14, no. 3, pp. 251\u2013265.","DOI":"10.2478\/jaiscr-2024-0014"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_006","doi-asserted-by":"crossref","unstructured":"N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, \u201cEnd-to-end object detection with transformers,\u201d in European conference on computer vision, pp. 213\u2013229, Springer, 2020.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_007","doi-asserted-by":"crossref","unstructured":"Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce, \u201cLearning mid-level features for recognition,\u201d in 2010 IEEE computer society conference on computer vision and pattern recognition, pp. 2559\u20132566, IEEE, 2010.","DOI":"10.1109\/CVPR.2010.5539963"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_008","unstructured":"Y.-L. Boureau, J. Ponce, and Y. LeCun, \u201cA theoretical analysis of feature pooling in visual recognition,\u201d in Proceedings of the 27th international conference on machine learning (ICML-10), pp. 111\u2013118, 2010."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_009","doi-asserted-by":"crossref","unstructured":"A. Stergiou, R. Poppe, and G. Kalliatakis, \u201cRefining activation downsampling with softpool,\u201d in Proceedings of the IEEE\/CVF international conference on computer vision, pp. 10357\u201310366, 2021.","DOI":"10.1109\/ICCV48922.2021.01019"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_010","unstructured":"M. D. Zeiler and R. Fergus, \u201cStochastic pooling for regularization of deep convolutional neural networks,\u201d 2013."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_011","doi-asserted-by":"crossref","unstructured":"D. Yu, H. Wang, P. Chen, and Z. Wei, \u201cMixed pooling for convolutional neural networks,\u201d in Rough Sets and Knowledge Technology: 9th International Conference, RSKT 2014, Shanghai, China, October 24-26, 2014, Proceedings 9, pp. 364\u2013375, Springer, 2014.","DOI":"10.1007\/978-3-319-11740-9_34"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_012","doi-asserted-by":"crossref","unstructured":"C. Gulcehre, K. Cho, R. Pascanu, and Y. Bengio, \u201cLearned-norm pooling for deep feedforward and recurrent neural networks,\u201d in Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2014, Nancy, France, September 15-19, 2014. Proceedings, Part I 14, pp. 530\u2013546, Springer, 2014.","DOI":"10.1007\/978-3-662-44848-9_34"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_013","doi-asserted-by":"crossref","unstructured":"S. Zhai, H. Wu, A. Kumar, Y. Cheng, Y. Lu, Z. Zhang, and R. Feris, \u201cS3pool: Pooling with stochastic spatial sampling,\u201d in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4970\u20134978, 2017.","DOI":"10.1109\/CVPR.2017.426"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_014","doi-asserted-by":"crossref","unstructured":"Z. Gao, L. Wang, and G. Wu, \u201cLip: Local importance-based pooling,\u201d in Proceedings of the IEEE\/CVF International Conference on Computer Vision, pp. 3355\u20133364, 2019.","DOI":"10.1109\/ICCV.2019.00345"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_015","unstructured":"J. Zhao and C. G. M. Snoek, \u201cLiftpool: Bidirectional convnet pooling,\u201d 2021."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_016","unstructured":"Q. Zhu, J. Huang, N. Zheng, H. Gao, C. Li, Y. Xu, F. Zhao, et al., \u201cFouridown: factoring down-sampling into shuffling and superposing,\u201d Advances in Neural Information Processing Systems, vol. 36, 2024."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_017","doi-asserted-by":"crossref","unstructured":"R. Sunkara and T. Luo, \u201cNo more strided convolutions or pooling: A new cnn building block for low-resolution images and small objects,\u201d in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 443\u2013459, Springer, 2022.","DOI":"10.1007\/978-3-031-26409-2_27"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_018","doi-asserted-by":"crossref","unstructured":"Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou, \u201cA survey of convolutional neural networks: analysis, applications, and prospects,\u201d IEEE transactions on neural networks and learning systems, vol. 33, no. 12, pp. 6999\u20137019, 2021.","DOI":"10.1109\/TNNLS.2021.3084827"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_019","doi-asserted-by":"crossref","unstructured":"A. Krizhevsky, I. Sutskever, and G. E. Hinton, \u201cImagenet classification with deep convolutional neural networks,\u201d Communications of the ACM, vol. 60, no. 6, pp. 84\u201390, 2017.","DOI":"10.1145\/3065386"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_020","unstructured":"A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, \u201cAn image is worth 16x16 words: Transformers for image recognition at scale,\u201d 2020."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_021","doi-asserted-by":"crossref","unstructured":"S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, \u201cCbam: Convolutional block attention module,\u201d in Proceedings of the European conference on computer vision (ECCV), pp. 3\u201319, 2018.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_022","doi-asserted-by":"crossref","unstructured":"C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, \u201cYolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors,\u201d in Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 7464\u20137475, 2023.","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_023","doi-asserted-by":"crossref","unstructured":"H. Su, S. Wei, S. Liu, J. Liang, C. Wang, J. Shi, and X. Zhang, \u201cHq-isnet: High-quality instance segmentation for remote sensing imagery,\u201d Remote Sensing, vol. 12, no. 6, p. 989, 2020.","DOI":"10.3390\/rs12060989"},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_024","unstructured":"G. Jocher, \u201cYOLOv5 by Ultralytics,\u201d May 2020."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_025","unstructured":"C. Li, L. Li, H. Jiang, K. Weng, Y. Geng, L. Li, Z. Ke, Q. Li, M. Cheng, W. Nie, Y. Li, B. Zhang, Y. Liang, L. Zhou, X. Xu, X. Chu, X. Wei, and X. Wei, \u201cYolov6: A single-stage object detection framework for industrial applications,\u201d 2022."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_026","unstructured":"Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, \u201cYolox: Exceeding yolo series in 2021,\u201d 2021."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_027","unstructured":"G. Jocher, A. Chaurasia, and J. Qiu, \u201cUltralytics YOLO,\u201d Jan. 2023."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_028","unstructured":"Y. Le and X. Yang, \u201cTiny imagenet visual recognition challenge,\u201d CS 231N, vol. 7, no. 7, p. 3, 2015."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_029","unstructured":"A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, \u201cMobilenets: Efficient convolutional neural networks for mobile vision applications,\u201d 2017."},{"key":"2026042813120169271_j_jaiscr-2025-0010_ref_030","doi-asserted-by":"crossref","unstructured":"J. Chen, S.-h. Kao, H. He, W. Zhuo, S. Wen, C.-H. Lee, and S.-H. G. Chan, \u201cRun, don\u2019t walk: Chasing higher flops for faster neural networks,\u201d in Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 12021\u201312031, 2023.","DOI":"10.1109\/CVPR52729.2023.01157"}],"container-title":["Journal of Artificial Intelligence and Soft Computing Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/reference-global.com\/pdf\/10.2478\/jaiscr-2025-0010","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T19:28:46Z","timestamp":1777404526000},"score":1,"resource":{"primary":{"URL":"https:\/\/reference-global.com\/article\/10.2478\/jaiscr-2025-0010"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,5]]},"references-count":30,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2025,2,5]]},"published-print":{"date-parts":[[2025,3,1]]}},"alternative-id":["10.2478\/jaiscr-2025-0010"],"URL":"https:\/\/doi.org\/10.2478\/jaiscr-2025-0010","relation":{},"ISSN":["2449-6499"],"issn-type":[{"value":"2449-6499","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,5]]}}}