{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T03:21:28Z","timestamp":1768879288993,"version":"3.49.0"},"reference-count":50,"publisher":"MDPI AG","issue":"14","license":[{"start":{"date-parts":[[2022,7,11]],"date-time":"2022-07-11T00:00:00Z","timestamp":1657497600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Spanish Ministry of Science, Innovation and Universities","award":["PID2019-111023RB-C33"],"award-info":[{"award-number":["PID2019-111023RB-C33"]}]},{"name":"University of Valladolid","award":["PID2019-111023RB-C33"],"award-info":[{"award-number":["PID2019-111023RB-C33"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Medical instruments detection in laparoscopic video has been carried out to increase the autonomy of surgical robots, evaluate skills or index recordings. However, it has not been extended to surgical gauzes. Gauzes can provide valuable information to numerous tasks in the operating room, but the lack of an annotated dataset has hampered its research. In this article, we present a segmentation dataset with 4003 hand-labelled frames from laparoscopic video. To prove the dataset potential, we analyzed several baselines: detection using YOLOv3, coarse segmentation, and segmentation with a U-Net. Our results show that YOLOv3 can be executed in real time but provides a modest recall. Coarse segmentation presents satisfactory results but lacks inference speed. Finally, the U-Net baseline achieves a good speed-quality compromise running above 30 FPS while obtaining an IoU of 0.85. The accuracy reached by U-Net and its execution speed demonstrate that precise and real-time gauze segmentation can be achieved, training convolutional neural networks on the proposed dataset.<\/jats:p>","DOI":"10.3390\/s22145180","type":"journal-article","created":{"date-parts":[[2022,7,12]],"date-time":"2022-07-12T03:50:36Z","timestamp":1657597836000},"page":"5180","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Gauze Detection and Segmentation in Minimally Invasive Surgery Video Using Convolutional Neural Networks"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7768-0185","authenticated-orcid":false,"given":"Guillermo","family":"S\u00e1nchez-Brizuela","sequence":"first","affiliation":[{"name":"Instituto de las Tecnolog\u00edas Avanzadas de la Producci\u00f3n (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1952-6152","authenticated-orcid":false,"given":"Francisco-Javier","family":"Santos-Criado","sequence":"additional","affiliation":[{"name":"Escuela T\u00e9cnica Superior de Ingenieros Industriales, Universidad Polit\u00e9cnica de Madrid, Calle de Jos\u00e9 Guti\u00e9rrez Abascal, 2, 28006 Madrid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Sanz-Gobernado","sequence":"additional","affiliation":[{"name":"Instituto de las Tecnolog\u00edas Avanzadas de la Producci\u00f3n (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5837-3510","authenticated-orcid":false,"given":"Eusebio","family":"de la Fuente-L\u00f3pez","sequence":"additional","affiliation":[{"name":"Instituto de las Tecnolog\u00edas Avanzadas de la Producci\u00f3n (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2877-7300","authenticated-orcid":false,"given":"Juan-Carlos","family":"Fraile","sequence":"additional","affiliation":[{"name":"Instituto de las Tecnolog\u00edas Avanzadas de la Producci\u00f3n (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7731-2411","authenticated-orcid":false,"given":"Javier","family":"P\u00e9rez-Turiel","sequence":"additional","affiliation":[{"name":"Instituto de las Tecnolog\u00edas Avanzadas de la Producci\u00f3n (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1556-7179","authenticated-orcid":false,"given":"Ana","family":"Cisnal","sequence":"additional","affiliation":[{"name":"Instituto de las Tecnolog\u00edas Avanzadas de la Producci\u00f3n (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,7,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"238","DOI":"10.5662\/wjm.v5.i4.238","article-title":"Laparoscopic surgery: A qualified systematic review","volume":"5","author":"Buia","year":"2015","journal-title":"World J. Methodol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1038\/s41591-018-0316-z","article-title":"A guide to deep learning in healthcare","volume":"25","author":"Esteva","year":"2019","journal-title":"Nat. Med."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1016\/j.bspc.2019.01.011","article-title":"A recurrent convolutional neural network approach for sensorless force estimation in robotic surgery","volume":"50","author":"Marban","year":"2019","journal-title":"Biomed. Signal Process. Control"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2005","DOI":"10.1007\/s11548-019-01953-x","article-title":"Segmenting and classifying activities in robot-assisted surgery with recurrent neural networks","volume":"14","author":"DiPietro","year":"2019","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Castro, D., Pereira, D., Zanchettin, C., Macedo, D., and Bezerra, B.L.D. (2019, January 14\u201319). Towards Optimizing Convolutional Neural Networks for Robotic Surgery Skill Evaluation. Proceedings of the International Joint Conference on Neural Networks, Budapest, Hungary.","DOI":"10.1109\/IJCNN.2019.8852341"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1217","DOI":"10.1007\/s11548-019-01995-1","article-title":"Video-based surgical skill assessment using 3D convolutional neural networks","volume":"14","author":"Funke","year":"2019","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1097","DOI":"10.1007\/s11548-019-01956-8","article-title":"Objective assessment of intraoperative technical skill in capsulorhexis using videos of cataract surgery","volume":"14","author":"Kim","year":"2019","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1611","DOI":"10.1007\/s11548-019-02039-4","article-title":"Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolutional neural networks","volume":"14","author":"Fawaz","year":"2019","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1542","DOI":"10.1109\/TMI.2017.2665671","article-title":"Detection and Localization of Robotic Tools in Robot-Assisted Surgery Videos Using Deep Neural Networks for Region Proposal and Detection","volume":"36","author":"Sarikaya","year":"2017","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Shvets, A.A., Rakhlin, A., Kalinin, A.A., and Iglovikov, V.I. (2019, January 17\u201320). Automatic Instrument Segmentation in Robot-Assisted Surgery using Deep Learning. Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA.","DOI":"10.1109\/ICMLA.2018.00100"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Jo, K., Choi, Y., Choi, J., and Chung, J.W. (2019). Robust Real-Time Detection of Laparoscopic Instruments in Robot Surgery Using Convolutional Neural Networks with Motion Vector Prediction. Appl. Sci., 9.","DOI":"10.3390\/app9142865"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"181723","DOI":"10.1109\/ACCESS.2020.3028910","article-title":"Surgical Tools Detection Based on Training Sample Adaptation in Laparoscopic Videos","volume":"8","author":"Wang","year":"2020","journal-title":"IEEE Access"},{"key":"ref_13","first-page":"1323","article-title":"Content-based processing and analysis of endoscopic images and videos: A survey","volume":"77","author":"Schoeffmann","year":"2017","journal-title":"Multimedia Tools Appl."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1016\/j.media.2018.05.001","article-title":"Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks","volume":"47","author":"Lamard","year":"2018","journal-title":"Med. Image Anal."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"101572","DOI":"10.1016\/j.media.2019.101572","article-title":"Multi-task recurrent convolutional network with correlation loss for surgical video analysis","volume":"59","author":"Jin","year":"2019","journal-title":"Med. Image Anal."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Primus, M.J., Schoeffmann, K., and Boszormenyi, L. (2016, January 15\u201317). Temporal segmentation of laparoscopic videos into surgical phases. Proceedings of the 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI), Bucharest, Romania.","DOI":"10.1109\/CBMI.2016.7500249"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"e2387","DOI":"10.1002\/rcs.2387","article-title":"A deep learning framework for real-time 3D model registration in robot-assisted laparoscopic surgery","volume":"18","author":"Padovan","year":"2022","journal-title":"Int. J. Med Robot. Comput. Assist. Surg."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Fran\u00e7ois, T., Calvet, L., S\u00e8ve-D\u2019Erceville, C., Bourdel, N., and Bartoli, A. (October, January 27). Image-Based Incision Detection for Topological Intraoperative 3D Model Update in Augmented Reality Assisted Laparoscopic Surgery. Proceedings of the Medical Image Computing and Computer Assisted Intervention\u2013MICCAI 2021, Lecture Notes in Computer Science, Strasbourg, France.","DOI":"10.1007\/978-3-030-87202-1_62"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Garcia-Martinez, A., Juan, C.G., Garcia, N.M., and Sabater-Navarro, J.M. (2015, January 16\u201319). Automatic detection of surgical gauzes using Computer Vision. Proceedings of the 2015 23rd Mediterranean Conference on Control and Automation, MED 2015-Conference Proceedings, Torremolinos, Spain.","DOI":"10.1109\/MED.2015.7158835"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"De La Fuente, E., Trespaderne, F.M., Santos, L., Fraile, J.C., and Turiel, J.P. (September, January 30). Parallel computing for real time gauze detection in laparoscopy images. Proceedings of the BioSMART 2017 2nd International Conference on Bio-Engineering for Smart Technologies, Paris, France.","DOI":"10.1109\/BIOSMART.2017.8095328"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"105378","DOI":"10.1016\/j.cmpb.2020.105378","article-title":"Automatic gauze tracking in laparoscopic surgery using image texture analysis","volume":"190","author":"Marinero","year":"2020","journal-title":"Comput. Methods Programs Biomed."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1016\/j.aorn.2010.09.034","article-title":"Designing a Safer Process to Prevent Retained Surgical Sponges: A Healthcare Failure Mode and Effect Analysis","volume":"94","author":"Steelman","year":"2011","journal-title":"AORN J."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1186\/s13037-018-0166-0","article-title":"Retained surgical sponges: A descriptive study of 319 occurrences and contributing factors from 2012 to 2017","volume":"12","author":"Steelman","year":"2018","journal-title":"Patient Saf. Surg."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"4630","DOI":"10.18203\/2320-1770.ijrcog20194912","article-title":"Gossypiboma: A surgical menace","volume":"8","author":"Shah","year":"2019","journal-title":"Int. J. Reprod. Contracept. Obstet. Gynecol."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Bello-Cerezo, R., Bianconi, F., Di Maria, F., Napoletano, P., and Smeraldi, F. (2019). Comparative Evaluation of Hand-Crafted Image Descriptors vs. Off-the-Shelf CNN-Based Features for Colour Texture Classification under Ideal and Realistic Conditions. Appl. Sci., 9.","DOI":"10.3390\/app9040738"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"S\u00e1nchez-Brizuela, G., and de la Fuente L\u00f3pez, E. (2022). Dataset: Gauze detection and segmentation in minimally invasive surgery video using convolutional neural networks. Zenodo.","DOI":"10.3390\/s22145180"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_28","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the Inception Architecture for Computer Vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_30","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1109\/TPAMI.2016.2572683","article-title":"Fully convolutional networks for semantic segmentation","volume":"39","author":"Shelhamer","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_36","unstructured":"Chen, L.-C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. Proceedings of the European conference on computer vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_38","unstructured":"Yakubovskiy, P. (2021, February 11). Segmentation models. GitHub Repos. Available online: https:\/\/github.com\/qubvel\/segmentation_models."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L. (2018, January 18\u201323). MobileNetV2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"ImageNet Large Scale Visual Recognition Challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_41","unstructured":"Kingma, D.P., and Ba, J.L. (2015, January 7\u20139). Adam: A method for stochastic optimization. Proceedings of the 3rd International Conference for Learning Representations ICLR 2015, San Diego, CA, USA."},{"key":"ref_42","first-page":"240","article-title":"Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations","volume":"Volume 10553","author":"Sudre","year":"2017","journal-title":"Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Proceedings of the Third International Workshop, DLMIA 2017, and 7th International Workshop, ML-CDS 2017, Held in Conjunction with MICCAI 2017, Qu\u00e9bec City, QC, Canada, 14 September 2017"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1109\/TMI.2016.2593957","article-title":"EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos","volume":"36","author":"Twinanda","year":"2016","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_44","unstructured":"Hong, W.-Y., Kao, C.-L., Kuo, Y.-H., Wang, J.-R., Chang, W.-L., and Shih, C.-S. (2012). CholecSeg8k: A Semantic Segmentation Dataset for Laparoscopic Cholecystectomy Based on Cholec80. arXiv."},{"key":"ref_45","first-page":"438","article-title":"Can masses of non-experts train highly accurate image classifiers? A crowdsourcing approach to instrument segmentation in laparoscopic images","volume":"17","author":"Mersmann","year":"2014","journal-title":"Med. Image Comput. Comput. Assist. Interv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Leibetseder, A., Petscharnig, S., Primus, M.J., Kletz, S., M\u00fcnzer, B., Schoeffmann, K., and Keckstein, J. (2018, January 12\u201315). LapGyn4: A dataset for 4 automatic content analysis problems in the domain of laparoscopic gynecology. Proceedings of the 9th ACM Multimedia Systems Conference MMSys 2018, New York, NY, USA.","DOI":"10.1145\/3204949.3208127"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"5377","DOI":"10.1007\/s00464-019-07330-8","article-title":"SurgAI: Deep learning for computerized laparoscopic image understanding in gynaecology","volume":"34","author":"Zadeh","year":"2020","journal-title":"Surg. Endosc."},{"key":"ref_48","unstructured":"Stauder, R., Ostler, D., Kranzfelder, M., Koller, S., Feu\u00dfner, H., and Navab, N. (2020, December 15). The TUM LapChole Dataset for the M2CAI 2016 Workflow Challenge. Available online: http:\/\/arxiv.org\/abs\/1610.09278."},{"key":"ref_49","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021, June 16). An Image is Worth 16 \u00d7 16 Words: Transformers for Image Recognition at Scale. Available online: http:\/\/arxiv.org\/abs\/2010.11929."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-End Object Detection with Transformers. Proceedings of the BT-Computer Vision\u2013ECCV 2020, Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_13"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/14\/5180\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:47:55Z","timestamp":1760140075000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/14\/5180"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,11]]},"references-count":50,"journal-issue":{"issue":"14","published-online":{"date-parts":[[2022,7]]}},"alternative-id":["s22145180"],"URL":"https:\/\/doi.org\/10.3390\/s22145180","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,11]]}}}