{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T01:54:24Z","timestamp":1780365264923,"version":"3.54.1"},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2021,9,13]],"date-time":"2021-09-13T00:00:00Z","timestamp":1631491200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,9,13]],"date-time":"2021-09-13T00:00:00Z","timestamp":1631491200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput &amp; Applic"],"published-print":{"date-parts":[[2022,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Computer vision (CV) technologies are assisting the health care industry in many respects, i.e., disease diagnosis. However, as a pivotal procedure before and after surgery, the inventory work of surgical instruments has not been researched with the CV-powered technologies. To reduce the risk and hazard of surgical tools\u2019 loss, we propose a study of systematic surgical instrument classification and introduce a novel attention-based deep neural network called SKA-ResNet which is mainly composed of: (<jats:italic>a<\/jats:italic>) A feature extractor with selective kernel attention module to automatically adjust the receptive fields of neurons and enhance the learnt expression and (<jats:italic>b<\/jats:italic>) A multi-scale regularizer with KL-divergence as the constraint to exploit the relationships between feature maps. Our method is easily trained end-to-end in only one stage with few additional calculation burdens. Moreover, to facilitate our study, we create a new surgical instrument dataset called SID19 (with 19 kinds of surgical tools consisting of 3800 images) for the first time. Experimental results show the superiority of SKA-ResNet for the classification of surgical tools on SID19 when compared with state-of-the-art models. The classification accuracy of our method reaches up to 97.703%, which is well supportive for the inventory and recognition study of surgical tools. Also, our method can achieve state-of-the-art performance on four challenging fine-grained visual classification datasets.<\/jats:p>","DOI":"10.1007\/s00521-021-06368-x","type":"journal-article","created":{"date-parts":[[2021,9,13]],"date-time":"2021-09-13T06:02:37Z","timestamp":1631512957000},"page":"1577-1591","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Adaptive kernel selection network with attention constraint for surgical instrument classification"],"prefix":"10.1007","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9929-2650","authenticated-orcid":false,"given":"Yaqing","family":"Hou","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wenkai","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Qian","family":"Liu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hongwei","family":"Ge","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jun","family":"Meng","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Qiang","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaopeng","family":"Wei","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,9,13]]},"reference":[{"issue":"5","key":"6368_CR1","doi-asserted-by":"publisher","first-page":"717","DOI":"10.1016\/S0731-7085(99)00272-1","volume":"22","author":"S Agatonovickustrin","year":"2000","unstructured":"Agatonovickustrin S, Beresford R (2000) Basic concepts of artificial neural network (ann) modeling and its application in pharmaceutical research. Journal of Pharmaceutical and Biomedical Analysis 22(5): 717\u2013727","journal-title":"J Pharm Biomed Anal"},{"key":"6368_CR2","doi-asserted-by":"crossref","unstructured":"Apostolopoulos ID, MT (2020) Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks. Phys Eng Sci Med 43(2): 635\u2013640","DOI":"10.1007\/s13246-020-00865-4"},{"key":"6368_CR3","doi-asserted-by":"crossref","unstructured":"Balakrishnan G, Zhao A, Sabuncu MR, Dalca AV, Guttag JV (2018) An unsupervised learning model for deformable medical image registration. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 9252\u20139260","DOI":"10.1109\/CVPR.2018.00964"},{"key":"6368_CR4","doi-asserted-by":"crossref","unstructured":"Bouget D, Benenson R, Omran M, Riffaud L, Schiele B, Jannin P (2015) Detecting surgical tools by modelling local appearance and global shape. IEEE Trans Med Imaging 34(12): 2603\u20132617","DOI":"10.1109\/TMI.2015.2450831"},{"key":"6368_CR5","doi-asserted-by":"crossref","unstructured":"Cai Z, Fan Q, Feris RS, Vasconcelos N (2016) A unified multi-scale deep convolutional neural network for fast object detection. Proceedings of the European Conference on Computer Vision (ECCV) pp. 354\u2013370","DOI":"10.1007\/978-3-319-46493-0_22"},{"key":"6368_CR6","doi-asserted-by":"crossref","unstructured":"Caraiman S, Zvoristeanu O, Burlacu A, Herghelegiu P (2019) Stereo vision based sensory substitution for the visually impaired. Sensors 19(12): 2771\u20132788","DOI":"10.3390\/s19122771"},{"key":"6368_CR7","doi-asserted-by":"crossref","unstructured":"Chilamkurthy S, Ghosh R, Tanamala S, Biviji M, Campeau NG, Venugopal VK, Mahajan V, Rao P, Warier P (2018) Deep learning algorithms for detection of critical findings in head ct scans: a retrospective study. The Lancet 392, (10162): 2388\u20132396","DOI":"10.1016\/S0140-6736(18)31645-3"},{"key":"6368_CR8","doi-asserted-by":"crossref","unstructured":"Filho PPR, Barros ACDS, Ramalho GLB, Pereira CR, Papa JP, De Albuquerque VHC, Tavares JMRS (2019) Automated recognition of lung diseases in ct images based on the optimum-path forest classifier. Neural Comput Appl, 31(2): 901\u2013914 (2019)","DOI":"10.1007\/s00521-017-3048-y"},{"key":"6368_CR9","doi-asserted-by":"crossref","unstructured":"Fu J, Zheng H, Mei T (2017) Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 4438\u20134446","DOI":"10.1109\/CVPR.2017.476"},{"key":"6368_CR10","doi-asserted-by":"crossref","unstructured":"Gao Y, Beijbom O, Zhang N, Darrell T (2016) Compact bilinear pooling. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 317\u2013326","DOI":"10.1109\/CVPR.2016.41"},{"key":"6368_CR11","doi-asserted-by":"crossref","unstructured":"Garciaperazaherrera LC, Li W, Gruijthuijsen C, Devreker A, Attilakos G, Deprest J, Poorten EV, Stoyanov D, Vercauteren T, Ourselin S (2016) Real-time segmentation of non-rigid surgical tools based on deep learning and tracking. Lect Notes Comput Sci, 10170: 84\u201395","DOI":"10.1007\/978-3-319-54057-3_8"},{"key":"6368_CR12","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"6368_CR13","doi-asserted-by":"crossref","unstructured":"Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 7132\u20137141","DOI":"10.1109\/CVPR.2018.00745"},{"key":"6368_CR14","doi-asserted-by":"crossref","unstructured":"Huang S, Xu Z, Tao D, Zhang Y (2016) Part-stacked cnn for fine-grained visual categorization. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 1173\u20131182","DOI":"10.1109\/CVPR.2016.132"},{"key":"6368_CR15","doi-asserted-by":"crossref","unstructured":"Jeganathan VE, Shah S (2009) Robotic technology in ophthalmic surgery. Curr Opin Ophthalmol 21: 75\u201380","DOI":"10.1097\/ICU.0b013e328333371d"},{"key":"6368_CR16","doi-asserted-by":"crossref","unstructured":"Kalan S, Chauhan S, Coelho RF, Orvieto MA, Camacho I, Palmer KJ, Patel VR (2010) History of robotic surgery. JRobotic Surg 4(3): 141\u2013147","DOI":"10.1007\/s11701-010-0202-2"},{"key":"6368_CR17","doi-asserted-by":"crossref","unstructured":"King BF (2018) Artificial intelligence and radiology: What will the future hold?. J Am College  Radiol 15(3): 501\u2013503","DOI":"10.1016\/j.jacr.2017.11.017"},{"key":"6368_CR18","doi-asserted-by":"crossref","unstructured":"Kong T, Yao A, Chen Y, Sun F (2016) Hypernet: Towards accurate region proposal generation and joint object detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 845\u2013853","DOI":"10.1109\/CVPR.2016.98"},{"key":"6368_CR19","doi-asserted-by":"crossref","unstructured":"Li H, Giger ML, Huynh BQ, Antropova N (2017) Deep learning in breast cancer risk assessment: evaluation of convolutional neural networks on a clinical dataset of full-field digital mammograms. Journal of medical imaging 4(4):041304","DOI":"10.1117\/1.JMI.4.4.041304"},{"key":"6368_CR20","doi-asserted-by":"crossref","unstructured":"Li P, Xie J, Wang Q, Gao Z (2018) Towards faster training of global covariance pooling networks by iterative matrix square root normalization. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 947\u2013955","DOI":"10.1109\/CVPR.2018.00105"},{"key":"6368_CR21","unstructured":"Li X, Hu X, Yang J (2019) Spatial group-wise enhance: Improving semantic feature learning in convolutional networks. arXiv preprint arXiv: Computer Vision and Pattern Recognition"},{"key":"6368_CR22","doi-asserted-by":"crossref","unstructured":"Li X, Wang W, Hu X, Yang J (2019) Selective kernel networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 510\u2013519","DOI":"10.1109\/CVPR.2019.00060"},{"key":"6368_CR23","doi-asserted-by":"crossref","unstructured":"Li Z, Yang Y, Liu X, Zhou F, Wen S, Xu W (2017) Dynamic computational time for visual attention. IEEE International Conference on Computer Vision (ICCV) pp. 1199\u20131209","DOI":"10.1109\/ICCVW.2017.145"},{"key":"6368_CR24","doi-asserted-by":"crossref","unstructured":"Lin D, Shen X, Lu C, Jia J (2015) Deep lac: deep localization, alignment and classification for fine-grained recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 1666\u20131674","DOI":"10.1109\/CVPR.2015.7298775"},{"key":"6368_CR25","doi-asserted-by":"crossref","unstructured":"Lin T, Roychowdhury A, Maji S (2015) Bilinear cnn models for fine-grained visual recognition. IEEE International Conference on Computer Vision (ICCV) pp. 1449\u20131457","DOI":"10.1109\/ICCV.2015.170"},{"key":"6368_CR26","doi-asserted-by":"crossref","unstructured":"Lin TY, Doll\u00e1r P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 2117\u20132125","DOI":"10.1109\/CVPR.2017.106"},{"key":"6368_CR27","doi-asserted-by":"crossref","unstructured":"Liu W, Anguelov D, Erhan D, Szegedy C, Berg AC (2016) Ssd: single shot multibox detector. Proceedings of the European Conference on Computer Vision (ECCV) pp. 21\u201337","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"6368_CR28","unstructured":"Liu X, Xia T, Wang J, Yang Y, Zhou F, Lin Y (2016) Fully convolutional attention networks for fine-grained recognition. arXiv preprint arXiv:1603.06765"},{"key":"6368_CR29","doi-asserted-by":"crossref","unstructured":"Luo W, Yang X, Mo X, Lu Y, Davis LS, Li J, Yang J, Lim S (2019) Cross-x learning for fine-grained visual categorization. IEEE International Conference on Computer Vision (ICCV) pp. 8242\u20138251","DOI":"10.1109\/ICCV.2019.00833"},{"key":"6368_CR30","doi-asserted-by":"crossref","unstructured":"Milletari F, Navab N, Ahmadi S (2016) V-net: fully convolutional neural networks for volumetric medical image segmentation. 2016 Fourth International Conference on 3D Vision (3DV) pp. 565\u2013571","DOI":"10.1109\/3DV.2016.79"},{"key":"6368_CR31","unstructured":"Pakhomov D, Premachandran V, Allan M, Azizian M, Navab N (2017) Deep residual learning for instrument segmentation in robotic surgery. arXiv preprint arXiv: Computer Vision and Pattern Recognition"},{"key":"6368_CR32","unstructured":"Park J, Woo S, Lee J, Kweon IS (2018) Bam: Bottleneck attention module. arXiv preprint arXiv: Computer Vision and Pattern Recognition"},{"key":"6368_CR33","unstructured":"Prati A, Shan C, Wang KI (2019) Sensors, vision and networks: From video surveillance to activity recognition and health monitoring. J Ambient Intell Smart Environ 11(1): 5\u201322"},{"key":"6368_CR34","doi-asserted-by":"crossref","unstructured":"Roth HR, Lu L, Liu J, Yao J, Seff A, Cherry KM, Kim L, Summers RM (2016) Improving computer-aided detection using convolutional neural networks and random view aggregation. IEEE Trans Med Imaging 35(5): 1170\u20131181","DOI":"10.1109\/TMI.2015.2482920"},{"key":"6368_CR35","doi-asserted-by":"crossref","unstructured":"Sanchezgarcia M, Martinezcantin R, Guerrero JJ (2020) Semantic and structural image segmentation for prosthetic vision. PLOS ONE, 15(1)","DOI":"10.1371\/journal.pone.0227677"},{"key":"6368_CR36","doi-asserted-by":"crossref","unstructured":"Sekaran K, Chandana P, Krishna NM, Kadry S (2019) Deep learning convolutional neural network (cnn) with gaussian mixture model for predicting pancreatic cancer. Multimedia Tools and Applications pp. 1\u201315","DOI":"10.1007\/s11042-019-7419-5"},{"key":"6368_CR37","unstructured":"Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556"},{"key":"6368_CR38","doi-asserted-by":"crossref","unstructured":"Sun M, Yuan Y, Zhou F, Ding E (2018) Multi-attention multi-class constraint for fine-grained image recognition. Proceedings of the European Conference on Computer Vision (ECCV) pp. 805\u2013821","DOI":"10.1007\/978-3-030-01270-0_49"},{"key":"6368_CR39","doi-asserted-by":"crossref","unstructured":"Suo Q, Ma F, Yuan Y, Huai M, Zhong W, Zhang A, Gao J (2017) Personalized disease prediction using a cnn-based similarity learning method. IEEE International Conference on Bioinformatics and Biomedicine pp. 811\u2013816","DOI":"10.1109\/BIBM.2017.8217759"},{"key":"6368_CR40","doi-asserted-by":"crossref","unstructured":"Wang S, Kang B, Ma J, Zeng X, Xiao M, Guo J, Cai M, Yang J, Li Y, Meng X, et al (2021) A deep learning algorithm using ct images to screen for corona virus disease (covid-19). European radiol pp. 1\u20139","DOI":"10.1007\/s00330-021-07715-1"},{"key":"6368_CR41","doi-asserted-by":"crossref","unstructured":"Wang Y, Morariu VI, Davis LS (2018) Learning a discriminative filter bank within a cnn for fine-grained recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 4148\u20134157","DOI":"10.1109\/CVPR.2018.00436"},{"key":"6368_CR42","doi-asserted-by":"crossref","unstructured":"Woo S, Park J, Lee J, Kweon IS (2018) Cbam: convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV) pp. 3\u201319","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"6368_CR43","doi-asserted-by":"crossref","unstructured":"Xie S, Girshick R, Dollar P, Tu Z, He K (2017) Aggregated residual transformations for deep neural networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 5987\u20135995","DOI":"10.1109\/CVPR.2017.634"},{"key":"6368_CR44","doi-asserted-by":"crossref","unstructured":"Yang Z, Luo T, Wang D, Hu Z, Gao J, Wang L (2018) Learning to navigate for fine-grained classification. Proceedings of the European Conference on Computer Vision (ECCV) pp. 420\u2013435","DOI":"10.1007\/978-3-030-01264-9_26"},{"key":"6368_CR45","doi-asserted-by":"crossref","unstructured":"Zhang N, Donahue J, Girshick R, Darrell T (2014) Part-based r-cnns for fine-grained category detection. Proceedings of the European Conference on Computer Vision (ECCV) pp. 834\u2013849","DOI":"10.1007\/978-3-319-10590-1_54"},{"key":"6368_CR46","doi-asserted-by":"crossref","unstructured":"Zhao R, Yan R, Chen Z, Mao K, Wang P, Gao RX (2019) Deep learning and its applications to machine health monitoring. Mech Syst Signal Processing 115: 213\u2013237","DOI":"10.1016\/j.ymssp.2018.05.050"},{"key":"6368_CR47","doi-asserted-by":"crossref","unstructured":"Zhao Z, Chen Z, Voros S, Cheng X (2019) Real-time tracking of surgical instruments based on spatio-temporal context and deep learning. Comput Assisted Surg 24(sup1): 20\u201329","DOI":"10.1080\/24699322.2018.1560097"},{"key":"6368_CR48","doi-asserted-by":"crossref","unstructured":"Zheng H, Fu J, Tao M, Luo J (2017) Learning multi-attention convolutional neural network for fine-grained image recognition. IEEE International Conference on Computer Vision (ICCV) pp. 5219\u20135227","DOI":"10.1109\/ICCV.2017.557"},{"key":"6368_CR49","doi-asserted-by":"crossref","unstructured":"Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 2921\u20132929","DOI":"10.1109\/CVPR.2016.319"},{"key":"6368_CR50","doi-asserted-by":"crossref","unstructured":"Zhou Y, He L, Huang Y, Chen S, Wu P, Ye W, Liu Z, Liang C (2017) Ct-based radiomics signature: a potential biomarker for preoperative prediction of early recurrence in hepatocellular carcinoma. Abdominal Radiology 42(6): 1695\u20131704","DOI":"10.1007\/s00261-017-1072-0"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-021-06368-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-021-06368-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-021-06368-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,21]],"date-time":"2022-01-21T15:37:54Z","timestamp":1642779474000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-021-06368-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,13]]},"references-count":50,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,1]]}},"alternative-id":["6368"],"URL":"https:\/\/doi.org\/10.1007\/s00521-021-06368-x","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,13]]},"assertion":[{"value":"20 March 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 July 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 September 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}