{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T19:43:28Z","timestamp":1771616608828,"version":"3.50.1"},"reference-count":50,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2022,11,16]],"date-time":"2022-11-16T00:00:00Z","timestamp":1668556800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100007219","name":"Natural Science Foundation of Shanghai","doi-asserted-by":"publisher","award":["19ZR1435900"],"award-info":[{"award-number":["19ZR1435900"]}],"id":[{"id":"10.13039\/100007219","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Neurorobot."],"abstract":"<jats:sec><jats:title>Introduction<\/jats:title><jats:p>Existing multi-view-based 3D model classification methods have the problems of insufficient view refinement feature extraction and poor generalization ability of the network model, which makes it difficult to further improve the classification accuracy. To this end, this paper proposes a multi-view SoftPool attention convolutional network for 3D model classification tasks.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>This method extracts multi-view features through ResNest and adaptive pooling modules, and the extracted features can better represent 3D models. Then, the results of the multi-view feature extraction processed using SoftPool are used as the Query for the self-attentive calculation, which enables the subsequent refinement extraction. We then input the attention scores calculated by Query and Key in the self-attention calculation into the mobile inverted bottleneck convolution, which effectively improves the generalization of the network model. Based on our proposed method, a compact 3D global descriptor is finally generated, achieving a high-accuracy 3D model classification performance.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Experimental results showed that our method achieves 96.96% OA and 95.68% AA on ModelNet40 and 98.57% OA and 98.42% AA on ModelNet10.<\/jats:p><\/jats:sec><jats:sec><jats:title>Discussion<\/jats:title><jats:p>Compared with a multitude of popular methods, our algorithm model achieves the state-of-the-art classification accuracy.<\/jats:p><\/jats:sec>","DOI":"10.3389\/fnbot.2022.1029968","type":"journal-article","created":{"date-parts":[[2022,11,16]],"date-time":"2022-11-16T07:02:47Z","timestamp":1668582167000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Multi-view SoftPool attention convolutional networks for 3D model classification"],"prefix":"10.3389","volume":"16","author":[{"given":"Wenju","family":"Wang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaolin","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gang","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoran","family":"Zhou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2022,11,16]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","DOI":"10.1155\/2020\/1314598","article-title":"Applicability of a single depth sensor in real-time 3d clothes simulation: augmented reality virtual dressing room using kinect sensor","author":"Adikari","year":"2020","journal-title":"Adv. Hum. Comput. Interact"},{"key":"B2","doi-asserted-by":"publisher","first-page":"3244","DOI":"10.1109\/TVCG.2018.2866793","article-title":"Veram: view-enhanced recurrent attention model for 3D shape classification","volume":"25","author":"Chen","year":"2019","journal-title":"IEEE Trans. Vis. Comput. Graph"},{"key":"B3","first-page":"3965","article-title":"Coatnet: marrying convolution and attention for all data sizes,","volume-title":"Advances in Neural Information Processing Systems, Vol. 34","author":"Dai","year":"2021"},{"key":"B4","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2020.107446","article-title":"Point attention network for semantic segmentation of 3D point clouds","author":"Feng","year":"2020","journal-title":"Pattern Recognit"},{"key":"B5","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2018.00035","article-title":"Gvcnn: group-view convolutional neural networks for 3D shape recognition,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Feng","year":"2018"},{"key":"B6","doi-asserted-by":"crossref","first-page":"3650","DOI":"10.1109\/ICRA40945.2020.9197426","article-title":"Ycb-m: a multi-camera rgb-d dataset for object recognition and 6D of pose estimation,","volume-title":"2020 IEEE International Conference on Robotics and Automation (ICRA)","author":"Grenzd\u00f6rffer","year":"2020"},{"key":"B7","doi-asserted-by":"publisher","first-page":"3986","DOI":"10.1109\/TIP.2019.2904460","article-title":"3d2seqviews: aggregating sequential views for 3d global feature learning by cnn with hierarchical attention aggregation","volume":"28","author":"Han","year":"","journal-title":"IEEE Trans. Image Process"},{"key":"B8","doi-asserted-by":"publisher","first-page":"658","DOI":"10.1109\/TIP.2018.2868426","article-title":"Seqviews2seqlabels: learning 3D global features via aggregating sequential views by rnn with attention","volume":"28","author":"Han","year":"","journal-title":"IEEE Trans. Image Process"},{"key":"B9","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2016.90","article-title":"Deep residual learning for image recognition,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"He","year":"2016"},{"key":"B10","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR42600.2020.01112","article-title":"Randla-net: efficient semantic segmentation of large-scale point clouds,","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Hu","year":"2020"},{"key":"B11","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2017.243","article-title":"Densely connected convolutional networks,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Huang","year":"2017"},{"key":"B12","doi-asserted-by":"crossref","DOI":"10.1109\/ICCVW.2019.00503","article-title":"Momen(e)t: Flavor the moments in learning to classify shapes,","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV) Workshops","author":"Joseph-Rivlin","year":"2019"},{"key":"B13","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2018.00526","article-title":"Rotationnet: joint object categorization and pose estimation using multiviews from unsupervised viewpoints,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Kanezaki","year":"2018"},{"key":"B14","doi-asserted-by":"crossref","first-page":"1135","DOI":"10.1109\/ICRA40945.2020.9197155","article-title":"A 3D-deep-learning-based augmented reality calibration method for robotic environments using depth sensor data,","volume-title":"2020 IEEE International Conference on Robotics and Automation (ICRA)","author":"K\u00e4stner","year":"2020"},{"key":"B15","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1109\/3DIM.2005.71","article-title":"Scale selection for classification of point-sampled 3D surfaces,","volume-title":"Fifth International Conference on 3-D Digital Imaging and Modeling (3DIM'05)","author":"Lalonde","year":"2005"},{"key":"B16","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2018.00959","article-title":"Pointgrid: a deep network for 3D shape understanding,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Le","year":"2018"},{"key":"B17","doi-asserted-by":"crossref","first-page":"8152","DOI":"10.1109\/ICRA.2019.8794052","article-title":"Hierarchical depthwise graph convolutional neural network for 3D semantic segmentation of point clouds,","volume-title":"2019 International Conference on Robotics and Automation (ICRA)","author":"Liang","year":"2019"},{"key":"B18","doi-asserted-by":"publisher","first-page":"984","DOI":"10.1016\/j.ins.2020.09.057","article-title":"Hierarchical multi-view context modelling for 3D object classification and retrieval","volume":"547","author":"Liu","year":"2021","journal-title":"Inf. Sci"},{"key":"B19","doi-asserted-by":"publisher","first-page":"1291","DOI":"10.3390\/s20051291","article-title":"Study of postural stability features by using kinect depth sensors to assess body joint coordination patterns","volume":"20","author":"Liu","year":"2020","journal-title":"Sensors"},{"key":"B20","doi-asserted-by":"publisher","first-page":"1169","DOI":"10.1109\/TMM.2018.2875512","article-title":"Learning multi-view representation with lstm for 3D shape recognition and retrieval","volume":"21","author":"Ma","year":"2019","journal-title":"IEEE Trans. Multimedia"},{"key":"B21","doi-asserted-by":"crossref","first-page":"1560","DOI":"10.1109\/ICPR.2018.8546281","article-title":"3dmax-net: a multi-scale spatial contextual network for 3D point cloud semantic segmentation,","volume-title":"2018 24th International Conference on Pattern Recognition (ICPR)","author":"Ma","year":"2018"},{"key":"B22","doi-asserted-by":"crossref","first-page":"922","DOI":"10.1109\/IROS.2015.7353481","article-title":"Voxnet: a 3D convolutional neural network for real-time object recognition,","volume-title":"2015 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)","author":"Maturana","year":"2015"},{"key":"B23","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1016\/j.isprsjprs.2013.11.001","article-title":"Contextual classification of lidar data and building object detection in urban areas","volume":"87","author":"Niemeyer","year":"2014","journal-title":"ISPRS J. Photogram. Remote Sens"},{"key":"B24","author":"Paszke","year":"2017"},{"key":"B25","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2017.00020","article-title":"Compact model representation for 3D reconstruction","author":"Pontes","year":"2017","journal-title":"CoRR"},{"key":"B26","article-title":"Pointnet: Deep learning on point sets for 3d classification and segmentation,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Qi","year":""},{"key":"B27","article-title":"Pointnet++: deep hierarchical feature learning on point sets in a metric space,","volume-title":"Advances in Neural Information Processing Systems, Vol. 3","author":"Qi","year":""},{"key":"B28","first-page":"3813","article-title":"Dense-resolution network for point cloud classification and segmentation,","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV)","author":"Qiu","year":"2021"},{"key":"B29","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2017.701","article-title":"Octnet: learning deep 3d representations at high resolutions,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Riegler","year":"2017"},{"key":"B30","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2018.00474","article-title":"Mobilenetv2: inverted residuals and linear bottlenecks,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Sandler","year":"2018"},{"key":"B31","first-page":"10357","article-title":"Refining activation downsampling with softpool,","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV)","author":"Stergiou","year":"2021"},{"key":"B32","doi-asserted-by":"crossref","DOI":"10.1109\/ICCV.2015.114","article-title":"Multi-view convolutional neural networks for 3D shape recognition,","volume-title":"Proceedings of the IEEE International Conference on Computer Vision (ICCV)","author":"Su","year":"2015"},{"key":"B33","doi-asserted-by":"publisher","first-page":"868","DOI":"10.1109\/TIP.2020.3039378","article-title":"Drcnn: dynamic routing convolutional neural network for multi-view 3D object recognition","volume":"30","author":"Sun","year":"2021","journal-title":"IEEE Trans. Image Process"},{"key":"B34","doi-asserted-by":"crossref","DOI":"10.1109\/ICCV.2019.00167","article-title":"Revisiting point cloud classification: a new benchmark dataset and classification model on real-world data,","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV)","author":"Uy","year":"2019"},{"key":"B35","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1906.01592","article-title":"Dominant set clustering and pooling for multi-view 3D object recognition","author":"Wang","year":"2019","journal-title":"CoRR"},{"key":"B36","doi-asserted-by":"publisher","DOI":"10.3390\/rs10040612","article-title":"Msnet: multi-scale convolutional network for point cloud classification","author":"Wang","year":"2018","journal-title":"Remote Sens"},{"key":"B37","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3326362","article-title":"Dynamic graph cnn for learning on point clouds","volume":"38","author":"Wang","year":"2019","journal-title":"ACM Trans. Graph"},{"key":"B38","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR42600.2020.00192","article-title":"View-gcn: view-based graph convolutional network for 3d shape analysis,","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Wei","year":"2020"},{"key":"B39","doi-asserted-by":"publisher","first-page":"8855","DOI":"10.1109\/TIP.2020.3019925","article-title":"Point2spatialcapsule: aggregating features and spatial relationships of local regions on point clouds using spatial-aware capsules","volume":"29","author":"Wen","year":"2020","journal-title":"IEEE Trans. Image Process"},{"key":"B40","article-title":"3D shapenets: a deep representation for volumetric shapes,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Wu","year":"2015"},{"key":"B41","doi-asserted-by":"crossref","first-page":"3152","DOI":"10.1007\/978-3-030-88007-1_13","article-title":"Single-view 3D object reconstruction from shape priors in memory,","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Yang","year":"2021"},{"key":"B42","doi-asserted-by":"crossref","DOI":"10.1109\/ICCV.2019.00760","article-title":"Learning relationships for multi-view 3D object recognition,","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV)","author":"Yang","year":"2019"},{"key":"B43","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1016\/j.isprsjprs.2020.11.011","article-title":"Automatic 3D building reconstruction from multi-view aerial images with deep learning","volume":"171","author":"Yu","year":"2021","journal-title":"ISPRS J. Photogram. Remote Sens"},{"key":"B44","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2018.00027","article-title":"Multi-view harmonized bilinear network for 3D object recognition,","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Yu","year":"2018"},{"key":"B45","doi-asserted-by":"publisher","first-page":"55991","DOI":"10.1109\/ACCESS.2020.2981357","article-title":"Point cloud classification model based on a dual-input deep network framework","volume":"8","author":"Zhai","year":"2020","journal-title":"IEEE Access"},{"key":"B46","first-page":"7354","article-title":"Self-attention generative adversarial networks,","volume-title":"Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research","author":"Zhang","year":"2019"},{"key":"B47","first-page":"2736","article-title":"Resnest: split-attention networks,","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops","author":"Zhang","year":"2022"},{"key":"B48","doi-asserted-by":"publisher","first-page":"487","DOI":"10.1016\/j.neucom.2020.06.095","article-title":"Local K-NNS pattern in omni-direction graph convolution neural network for 3D point clouds","volume":"413","author":"Zhang","year":"2020","journal-title":"Neurocomputing"},{"key":"B49","first-page":"1","article-title":"Improved adam optimizer for deep neural networks,","volume-title":"2018 IEEE\/ACM 26th International Symposium on Quality of Service (IWQoS)","author":"Zhang","year":"2018"},{"key":"B50","doi-asserted-by":"publisher","first-page":"87","DOI":"10.1109\/TCE.2021.3057137","article-title":"End-to-end 6dof pose estimation from monocular rgb images","volume":"67","author":"Zou","year":"2021","journal-title":"IEEE Trans. Consum. Electron"}],"container-title":["Frontiers in Neurorobotics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2022.1029968\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,11,16]],"date-time":"2022-11-16T07:03:02Z","timestamp":1668582182000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2022.1029968\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,16]]},"references-count":50,"alternative-id":["10.3389\/fnbot.2022.1029968"],"URL":"https:\/\/doi.org\/10.3389\/fnbot.2022.1029968","relation":{},"ISSN":["1662-5218"],"issn-type":[{"value":"1662-5218","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,16]]},"article-number":"1029968"}}