{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:21:31Z","timestamp":1750220491011,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":31,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,23]],"date-time":"2021-10-23T00:00:00Z","timestamp":1634947200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,23]]},"DOI":"10.1145\/3495018.3495058","type":"proceedings-article","created":{"date-parts":[[2022,3,14]],"date-time":"2022-03-14T17:33:51Z","timestamp":1647279231000},"page":"230-237","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Scale-invariant Convolutional Capsule Network"],"prefix":"10.1145","author":[{"given":"Zihan","family":"Li","sequence":"first","affiliation":[{"name":"Dalian University of Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuqiu","family":"Kong","sequence":"additional","affiliation":[{"name":"Dalian University of Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Baocai","family":"Yin","sequence":"additional","affiliation":[{"name":"Dalian University of Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,3,14]]},"reference":[{"volume-title":"Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks","author":"Lecun Yann","key":"e_1_3_2_1_1_1","unstructured":"Yann Lecun , Yoshua Bengio , Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks , vol. 3361 , no. 10, 1995. Yann Lecun, Yoshua Bengio, Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, vol. 3361, no. 10, 1995."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/1886436.1886447"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298701"},{"key":"e_1_3_2_1_4_1","volume-title":"Scale-invariant convolutional neural networks. arXiv preprint arXiv: 1411.6369","author":"Yichong Xu","year":"2014","unstructured":"Xu Yichong , Xiao Tianjun , Zhang Jiaxing , Yang Kuiyuan , and Zheng Zhang . Scale-invariant convolutional neural networks. arXiv preprint arXiv: 1411.6369 , 2014 . Xu Yichong, Xiao Tianjun, Zhang Jiaxing, Yang Kuiyuan, and Zheng Zhang. Scale-invariant convolutional neural networks. arXiv preprint arXiv: 1411.6369, 2014."},{"key":"e_1_3_2_1_5_1","first-page":"7366","article-title":"Equivariance over scale","author":"Worrall Daniel","year":"2019","unstructured":"Daniel Worrall and Max Welling . Deep scale-spaces : Equivariance over scale . In Advances in Neural Information Processing Systems , pp. 7366 - 7378 , 2019 . Daniel Worrall and Max Welling. Deep scale-spaces: Equivariance over scale. In Advances in Neural Information Processing Systems, pp. 7366-7378, 2019.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.106"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.433"},{"key":"e_1_3_2_1_8_1","first-page":"2017","volume-title":"Advances in neural information processing systems","author":"Jaderberg Max","year":"2015","unstructured":"Max Jaderberg , Karen Simonyan , Andrew Zisserman , Spatial transformer networks . In Advances in neural information processing systems , pp. 2017 - 2025 , 2015 . Max Jaderberg, Karen Simonyan, Andrew Zisserman, Spatial transformer networks. In Advances in neural information processing systems, pp. 2017-2025, 2015."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/361237.361242"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/850924.851523"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.134287"},{"volume-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","author":"Chen Liang-Chieh","key":"e_1_3_2_1_13_1","unstructured":"Liang-Chieh Chen , George Papandreou , Iasonas Kokkinos , Kevin Murphy , and Alan L Yuille . Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs . IEEE transactions on pattern analysis and machine intelligence, vol. 40 , no. 4, pp. 834-848, 2017. Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE transactions on pattern analysis and machine intelligence, vol. 40, no. 4, pp. 834-848, 2017."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.690"},{"key":"e_1_3_2_1_15_1","volume-title":"Blurring the line between structure and learning to optimize and adapt receptive fields. arXiv preprint arXiv","author":"Shelhamer Evan","year":"1904","unstructured":"Evan Shelhamer , Dequan Wang , and Trevor Darrell . Blurring the line between structure and learning to optimize and adapt receptive fields. arXiv preprint arXiv : 1904 .11487, 2019. Evan Shelhamer, Dequan Wang, and Trevor Darrell. Blurring the line between structure and learning to optimize and adapt receptive fields. arXiv preprint arXiv: 1904.11487, 2019."},{"key":"e_1_3_2_1_16_1","first-page":"392","volume-title":"European conference on computer vision","author":"Yunchao Gong","year":"2014","unstructured":"Gong Yunchao , Wang Liwei , Guo Ruiqi , and Lazebnik Svetlana . Multiscale orderless pooling of deep convolutional activation features . In European conference on computer vision , pp. 392 - 407 , 2014 . Gong Yunchao, Wang Liwei, Guo Ruiqi, and Lazebnik Svetlana. Multiscale orderless pooling of deep convolutional activation features. In European conference on computer vision, pp. 392-407, 2014."},{"key":"e_1_3_2_1_17_1","first-page":"904","volume-title":"The IEEE Winter Conference on Applications of Computer Vision","author":"Wenju Xu","year":"2020","unstructured":"Xu Wenju , Wang Guanghui , Sullivan Alan , and Zhang Ziming . Towards learning affine-invariant representations via data-efficient cnns . In The IEEE Winter Conference on Applications of Computer Vision , pp. 904 - 913 , 2020 . Xu Wenju, Wang Guanghui, Sullivan Alan, and Zhang Ziming. Towards learning affine-invariant representations via data-efficient cnns. In The IEEE Winter Conference on Applications of Computer Vision, pp. 904-913, 2020."},{"key":"e_1_3_2_1_18_1","first-page":"13359","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Xinjiang Wang","year":"2020","unstructured":"Wang Xinjiang , Zhang Shilong , Yu Zhuoran , Feng Litong , and Zhang Wayne . Scale-equalizing pyramid convolution for object detection . In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition , pp. 13359 - 13368 , 2020 . Wang Xinjiang, Zhang Shilong, Yu Zhuoran, Feng Litong, and Zhang Wayne. Scale-equalizing pyramid convolution for object detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 13359-13368, 2020."},{"key":"e_1_3_2_1_19_1","first-page":"764","volume-title":"Proceedings of the IEEE international conference on computer vision","author":"Jifeng Dai","year":"2017","unstructured":"Dai Jifeng , Qi Haozhi , Xiong Yuwen , Li Yi , Zhang Guodong , Hu Han , and Wei Yichen . Deformable convolutional networks . In Proceedings of the IEEE international conference on computer vision , pp. 764 - 773 , 2017 . Dai Jifeng, Qi Haozhi, Xiong Yuwen, Li Yi, Zhang Guodong, Hu Han, and Wei Yichen. Deformable convolutional networks. In Proceedings of the IEEE international conference on computer vision, pp. 764-773, 2017."},{"volume-title":"Generalizing the hough transform to detect arbitrary shapes. Pattern recognition","author":"Ballard Dana H","key":"e_1_3_2_1_20_1","unstructured":"Dana H Ballard . Generalizing the hough transform to detect arbitrary shapes. Pattern recognition , vol. 13 , no. 2, pp. 111-122, 1981. Dana H Ballard. Generalizing the hough transform to detect arbitrary shapes. Pattern recognition, vol. 13, no. 2, pp. 111-122, 1981."},{"key":"e_1_3_2_1_21_1","first-page":"3856","volume-title":"Dynamic routing between capsules. Advances in neural information processing systems","author":"Sabour Sara","year":"2017","unstructured":"Sara Sabour , Nicholas Frosst , and Geoffrey E Hinton . Dynamic routing between capsules. Advances in neural information processing systems , vol. 30 , pp. 3856 - 3866 , 2017 . Sara Sabour, Nicholas Frosst, and Geoffrey E Hinton. Dynamic routing between capsules. Advances in neural information processing systems, vol. 30, pp. 3856-3866, 2017."},{"key":"e_1_3_2_1_22_1","volume-title":"International conference on learning representations","author":"Hinton Geoffrey E","year":"2018","unstructured":"Geoffrey E Hinton , Sara Sabour , and Nicholas Frosst . Matrix capsules with em routing . In International conference on learning representations , 2018 . Geoffrey E Hinton, Sara Sabour, and Nicholas Frosst. Matrix capsules with em routing. In International conference on learning representations, 2018."},{"key":"e_1_3_2_1_23_1","first-page":"15512","volume-title":"Advances in neural information processing systems","author":"Kosiorek Adam","year":"2019","unstructured":"Adam Kosiorek , Sara Sabour , Yee Whye Teh, and Geoffrey E Hinton. Stacked capsule autoencoders . In Advances in neural information processing systems , pp. 15512 - 15522 , 2019 . Adam Kosiorek, Sara Sabour, Yee Whye Teh, and Geoffrey E Hinton. Stacked capsule autoencoders. In Advances in neural information processing systems, pp. 15512-15522, 2019."},{"key":"e_1_3_2_1_24_1","unstructured":"M. Everingham L. Van Gool C. K. I. Williams J. Winn and A. Zisserman. The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results. http:\/\/www.pascalnetwork.org\/challenges\/VOC\/voc2012\/workshop\/index.html.  M. Everingham L. Van Gool C. K. I. Williams J. Winn and A. Zisserman. The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results. http:\/\/www.pascalnetwork.org\/challenges\/VOC\/voc2012\/workshop\/index.html."},{"key":"e_1_3_2_1_25_1","volume-title":"Microsoft coco: Common objects in context","author":"Lin Tsung-Yi","year":"2015","unstructured":"Tsung-Yi Lin , Michael Maire , Serge Belongie , Lubomir Bourdev , Ross Girshick , James Hays , Pietro Perona , Deva Ramanan , C. Lawrence Zitnick , and Piotr Dollar . Microsoft coco: Common objects in context , 2015 . Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, and Piotr Dollar. Microsoft coco: Common objects in context, 2015."},{"key":"e_1_3_2_1_26_1","volume-title":"Yolov3: An incremental improvement. arXiv preprint arXiv","author":"Redmon Joseph","year":"1804","unstructured":"Joseph Redmon and Ali Farhadi . Yolov3: An incremental improvement. arXiv preprint arXiv : 1804 .02767, 2018. Joseph Redmon and Ali Farhadi. Yolov3: An incremental improvement. arXiv preprint arXiv: 1804.02767, 2018."},{"key":"e_1_3_2_1_27_1","volume-title":"Gridmask data augmentation. arXiv preprint arXiv","author":"Chen Pengguang","year":"2001","unstructured":"Pengguang Chen . Gridmask data augmentation. arXiv preprint arXiv : 2001 .04086, 2020. Pengguang Chen. Gridmask data augmentation. arXiv preprint arXiv: 2001.04086, 2020."},{"key":"e_1_3_2_1_28_1","first-page":"6023","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Sangdoo Yun","year":"2019","unstructured":"Yun Sangdoo , Han Dongyoon , Oh Seong Joon , Chun Sanghyuk , Choe Junsuk , and Yoo Youngjoon . Cutmix : Regularization strategy to train strong classifiers with localizable features . In Proceedings of the IEEE International Conference on Computer Vision , pp. 6023 - 6032 , 2019 . Yun Sangdoo, Han Dongyoon, Oh Seong Joon, Chun Sanghyuk, Choe Junsuk, and Yoo Youngjoon. Cutmix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the IEEE International Conference on Computer Vision, pp. 6023-6032, 2019."},{"key":"e_1_3_2_1_29_1","volume-title":"Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv","author":"Alexey Bochkovskiy","year":"2004","unstructured":"Bochkovskiy Alexey , Wang Chienyao , and Liao Hong-Yuan Mark . Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv : 2004 .10934, 2020. Bochkovskiy Alexey, Wang Chienyao, and Liao Hong-Yuan Mark. Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv: 2004.10934, 2020."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00075"},{"key":"e_1_3_2_1_31_1","first-page":"10781","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Mingxing Tan","year":"2020","unstructured":"Tan Mingxing , Pang Ruoming , and Le Quoc V. Efficientdet : Scalable and efficient object detection . In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition , pp. 10781 - 10790 , 2020 . Tan Mingxing, Pang Ruoming, and Le Quoc V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 10781-10790, 2020."}],"event":{"name":"AIAM2021: 2021 3rd International Conference on Artificial Intelligence and Advanced Manufacture","acronym":"AIAM2021","location":"Manchester United Kingdom"},"container-title":["2021 3rd International Conference on Artificial Intelligence and Advanced Manufacture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3495018.3495058","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3495018.3495058","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:49:20Z","timestamp":1750193360000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3495018.3495058"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,23]]},"references-count":31,"alternative-id":["10.1145\/3495018.3495058","10.1145\/3495018"],"URL":"https:\/\/doi.org\/10.1145\/3495018.3495058","relation":{},"subject":[],"published":{"date-parts":[[2021,10,23]]},"assertion":[{"value":"2022-03-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}