{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,8]],"date-time":"2026-07-08T17:05:53Z","timestamp":1783530353703,"version":"3.55.0"},"reference-count":28,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2024,1,8]],"date-time":"2024-01-08T00:00:00Z","timestamp":1704672000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,1,8]],"date-time":"2024-01-08T00:00:00Z","timestamp":1704672000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Haag-Streit Foundation, Switzerland"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J CARS"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Purpose<\/jats:title>\n                <jats:p>Semantic segmentation plays a pivotal role in many applications related to medical image and video analysis. However, designing a neural network architecture for medical image and surgical video segmentation is challenging due to the diverse features of relevant classes, including heterogeneity, deformability, transparency, blunt boundaries, and various distortions. We propose a network architecture, DeepPyramid+, which addresses diverse challenges encountered in medical image and surgical video segmentation.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>The proposed DeepPyramid+ incorporates two major modules, namely \u201cPyramid View Fusion\u201d (PVF) and \u201cDeformable Pyramid Reception\u201d (DPR), to address the outlined challenges. PVF replicates a deduction process within the neural network, aligning with the human visual system, thereby enhancing the representation of relative information at each pixel position. Complementarily, DPR introduces shape- and scale-adaptive feature extraction techniques using dilated deformable convolutions, enhancing accuracy and robustness in handling heterogeneous classes and deformable shapes.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>Extensive experiments conducted on diverse datasets, including endometriosis videos, MRI images, OCT scans, and cataract and laparoscopy videos, demonstrate the effectiveness of DeepPyramid+ in handling various challenges such as shape and scale variation, reflection, and blur degradation. DeepPyramid+ demonstrates significant improvements in segmentation performance, achieving up to a 3.65% increase in Dice coefficient for intra-domain segmentation and up to a 17% increase in Dice coefficient for cross-domain segmentation.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>DeepPyramid+ consistently outperforms state-of-the-art networks across diverse modalities considering different backbone networks, showcasing its versatility. Accordingly, DeepPyramid+ emerges as a robust and effective solution, successfully overcoming the intricate challenges associated with relevant content segmentation in medical images and surgical videos. Its consistent performance and adaptability indicate its potential to enhance precision in computerized medical image and surgical video analysis applications.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1007\/s11548-023-03046-2","type":"journal-article","created":{"date-parts":[[2024,1,8]],"date-time":"2024-01-08T14:02:10Z","timestamp":1704722530000},"page":"851-859","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["DeepPyramid+: medical image segmentation using Pyramid View Fusion and Deformable Pyramid Reception"],"prefix":"10.1007","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0908-8972","authenticated-orcid":false,"given":"Negin","family":"Ghamsarian","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sebastian","family":"Wolf","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Martin","family":"Zinkernagel","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Klaus","family":"Schoeffmann","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Raphael","family":"Sznitman","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2024,1,8]]},"reference":[{"key":"3046_CR1","doi-asserted-by":"crossref","unstructured":"Ghamsarian N, Taschwer M, Putzgruber-Adamitsch D, Sarny S, Schoeffmann K (2021) Relevance detection in cataract surgery videos by spatio-temporal action localization. In: 2020 25th International conference on pattern recognition (ICPR), pp 10720\u201310727","DOI":"10.1109\/ICPR48806.2021.9412525"},{"key":"3046_CR2","doi-asserted-by":"crossref","unstructured":"Ghamsarian N (2020) Enabling relevance-based exploration of cataract videos. In: Proceedings of the 2020 international conference on multimedia retrieval, pp 378\u2013382","DOI":"10.1145\/3372278.3391937"},{"key":"3046_CR3","doi-asserted-by":"crossref","unstructured":"Ghamsarian N, Amirpourazarian H, Timmerer C, Taschwer M, Sch\u00f6ffmann K (2020) Relevance-based compression of cataract surgery videos using convolutional neural networks. In: Proceedings of the 28th ACM international conference on multimedia, pp 3577\u20133585","DOI":"10.1145\/3394171.3413658"},{"key":"3046_CR4","doi-asserted-by":"crossref","unstructured":"Ghamsarian N, Taschwer M, Putzgruber-Adamitsch D, Sarny S, El-Shabrawi Y, Schoeffmann K (2021) LensID: a CNN-RNN-based framework towards lens irregularity detection in cataract surgery videos. In: Medical image computing and computer assisted intervention\u2014MICCAI 2021: 24th international conference, Strasbourg, France, September 27\u2013October 1, 2021, Proceedings, Part VIII 24. Springer, pp 76\u201386","DOI":"10.1007\/978-3-030-87237-3_8"},{"key":"3046_CR5","doi-asserted-by":"publisher","DOI":"10.3389\/fendo.2022.946915","volume":"13","author":"X Huang","year":"2022","unstructured":"Huang X, Wang H, She C, Feng J, Liu X, Hu X, Chen L, Tao Y (2022) Artificial intelligence promotes the diagnosis and screening of diabetic retinopathy. Front Endocrinol 13:946915","journal-title":"Front Endocrinol"},{"key":"3046_CR6","doi-asserted-by":"crossref","unstructured":"Ghamsarian N, Taschwer M, Sznitman R, Schoeffmann K (2022) Deeppyramid: Enabling pyramid view and deformable pyramid reception for semantic segmentation in cataract surgery videos. In: Medical image computing and computer assisted intervention\u2014MICCAI 2022: 25th international conference, Singapore, September 18\u201322, 2022, Proceedings, Part V. Springer, pp 276\u2013286","DOI":"10.1007\/978-3-031-16443-9_27"},{"key":"3046_CR7","first-page":"234","volume-title":"Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015","author":"O Ronneberger","year":"2015","unstructured":"Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells WM, Frangi AF (eds) Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015. Springer, Cham, pp 234\u2013241"},{"key":"3046_CR8","doi-asserted-by":"crossref","unstructured":"Chen X, Zhang R, Yan P (2019) Feature fusion encoder decoder network for automatic liver lesion segmentation. In: 2019 IEEE 16th international symposium on biomedical imaging (ISBI 2019), pp 430\u2013433","DOI":"10.1109\/ISBI.2019.8759555"},{"key":"3046_CR9","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1007\/978-3-030-36711-4_13","volume-title":"Neural Information Processing","author":"Z-L Ni","year":"2019","unstructured":"Ni Z-L, Bian G-B, Zhou X-H, Hou Z-G, Xie X-L, Wang C, Zhou Y-J, Li R-Q, Li Z (2019) Raunet: Residual attention u-net for semantic segmentation of cataract surgical instruments. In: Gedeon T, Wong KW, Lee M (eds) Neural Information Processing. Springer, Cham, pp 139\u2013149"},{"issue":"10","key":"3046_CR10","doi-asserted-by":"publisher","first-page":"2281","DOI":"10.1109\/TMI.2019.2903562","volume":"38","author":"Z Gu","year":"2019","unstructured":"Gu Z, Cheng J, Fu H, Zhou K, Hao H, Zhao Y, Zhang T, Gao S, Liu J (2019) Ce-net: Context encoder network for 2d medical image segmentation. IEEE Trans Med Imaging 38(10):2281\u20132292","journal-title":"IEEE Trans Med Imaging"},{"issue":"07","key":"3046_CR11","first-page":"11782","volume":"34","author":"Z-L Ni","year":"2020","unstructured":"Ni Z-L, Bian G-B, Wang G-A, Zhou X-H, Hou Z-G, Chen H-B, Xie X-L (2020) Pyramid attention aggregation network for semantic segmentation of surgical instruments. Proc AAAI Conf Artif Intell 34(07):11782\u201311790","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"3046_CR12","doi-asserted-by":"crossref","unstructured":"Ni Z-L, Bian G-B, Wang G-A, Zhou X-H, Hou Z-G, Xie X-L, Li Z, Wang Y-H (2021) Barnet: bilinear attention network with adaptive receptive fields for surgical instrument segmentation. In: Proceedings of the twenty-ninth international conference on international joint conferences on artificial intelligence, pp 832\u2013838","DOI":"10.24963\/ijcai.2020\/116"},{"issue":"10","key":"3046_CR13","doi-asserted-by":"publisher","first-page":"3008","DOI":"10.1109\/TMI.2020.2983721","volume":"39","author":"S Feng","year":"2020","unstructured":"Feng S, Zhao H, Shi F, Cheng X, Wang M, Ma Y, Xiang D, Zhu W, Chen X (2020) CPFNet: Context pyramid fusion network for medical image segmentation. IEEE Trans Med Imaging 39(10):3008\u20133018","journal-title":"IEEE Trans Med Imaging"},{"issue":"6","key":"3046_CR14","doi-asserted-by":"publisher","first-page":"1856","DOI":"10.1109\/TMI.2019.2959609","volume":"39","author":"Z Zhou","year":"2020","unstructured":"Zhou Z, Siddiquee MMR, Tajbakhsh N, Liang J (2020) Unet++: Redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans Med Imaging 39(6):1856\u20131867","journal-title":"IEEE Trans Med Imaging"},{"issue":"2","key":"3046_CR15","doi-asserted-by":"publisher","first-page":"540","DOI":"10.1109\/TMI.2018.2867261","volume":"38","author":"AG Roy","year":"2019","unstructured":"Roy AG, Navab N, Wachinger C (2019) Recalibrating fully convolutional networks with spatial and channel \u201csqueeze and excitation\" blocks. IEEE Trans Med Imaging 38(2):540\u2013549","journal-title":"IEEE Trans Med Imaging"},{"key":"3046_CR16","doi-asserted-by":"crossref","unstructured":"Ghamsarian N, Taschwer M, Putzgruber-Adamitsch D, Sarny S, El-Shabrawi Y, Sch\u00f6ffmann K (2021) Recal-net: Joint region-channel-wise calibrated network for semantic segmentation in cataract surgery videos. In: Neural information processing: 28th international conference, ICONIP 2021, Sanur, Bali, Indonesia, December 8\u201312, 2021, Proceedings, Part III 28. Springer, pp 391\u2013402","DOI":"10.1007\/978-3-030-92238-2_33"},{"key":"3046_CR17","doi-asserted-by":"crossref","unstructured":"Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)","DOI":"10.1109\/CVPR.2017.660"},{"issue":"4","key":"3046_CR18","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"L-C Chen","year":"2018","unstructured":"Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2018) Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40(4):834\u2013848","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"3046_CR19","doi-asserted-by":"crossref","unstructured":"Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV)","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"3046_CR20","unstructured":"Ghamsarian N, El-Shabrawi Y, Nasirihaghighi S, Putzgruber-Adamitsch D, Zinkernagel M, Wolf S, Schoeffmann K, Sznitman R (2023) Cataract-1K: cataract surgery dataset for scene segmentation, phase recognition, and irregularity detection. arXiv preprint https:\/\/arxiv.org\/abs\/2312.06295"},{"key":"3046_CR21","unstructured":"Bodenstedt S, Speidel S, Allan M, Stoyanov D, Maier-Hein L, Kenngott H, Wagner M (2015) Multi-instrument EndoVis challenge dataset. https:\/\/endovissub-instrument.grand-challenge.org\/"},{"issue":"5","key":"3046_CR22","doi-asserted-by":"publisher","first-page":"6191","DOI":"10.1007\/s11042-021-11730-1","volume":"81","author":"A Leibetseder","year":"2022","unstructured":"Leibetseder A, Schoeffmann K, Keckstein J, Keckstein S (2022) Endometriosis detection and localization in laparoscopic gynecology. Multimed Tools Appl 81(5):6191\u20136215","journal-title":"Multimed Tools Appl"},{"key":"3046_CR23","doi-asserted-by":"crossref","unstructured":"Liu Q, Dou Q, Yu L, Heng PA (2020) MS-Net: multi-site network for improving prostate segmentation with heterogeneous MRI data. IEEE Trans Med Imaging","DOI":"10.1109\/TMI.2020.2974574"},{"issue":"8","key":"3046_CR24","doi-asserted-by":"publisher","first-page":"1858","DOI":"10.1109\/TMI.2019.2901398","volume":"38","author":"H Bogunovic","year":"2019","unstructured":"Bogunovic H, Venhuizen F, Klimscha S, Apostolopoulos S, Bab-Hadiashar A, Bagci U, Beg MF, Bekalo L, Chen Q, Ciller C, Gopinath K, Gostar AK, Jeon K, Ji Z, Kang SH, Koozekanani DD, Lu D, Morley D, Parhi KK, Park HS, Rashno A, Sarunic M, Shaikh S, Sivaswamy J, Tennakoon R, Yadav S, De Zanet S, Waldstein SM, Gerendas BS, Klaver C, S\u00e1nchez CI, Schmidt-Erfurth U (2019) Retouch: the retinal oct fluid detection and segmentation benchmark and challenge. IEEE Trans Med Imaging 38(8):1858\u20131874","journal-title":"IEEE Trans Med Imaging"},{"key":"3046_CR25","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2021.102053","volume":"71","author":"M Grammatikopoulou","year":"2021","unstructured":"Grammatikopoulou M, Flouty E, Kadkhodamohammadi A, Quellec G, Chow A, Nehme J, Luengo I, Stoyanov D (2021) CaDIS: Cataract dataset for surgical RGB-image segmentation. Med Image Anal 71:102053","journal-title":"Med Image Anal"},{"key":"3046_CR26","doi-asserted-by":"crossref","unstructured":"Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 801\u2013818","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"3046_CR27","doi-asserted-by":"crossref","unstructured":"Xiao T, Liu Y, Zhou B, Jiang Y, Sun J (2018) Unified perceptual parsing for scene understanding. In: Proceedings of the European conference on computer vision (ECCV), pp 418\u2013434","DOI":"10.1007\/978-3-030-01228-1_26"},{"key":"3046_CR28","doi-asserted-by":"crossref","unstructured":"Ghamsarian N, Gamazo\u00a0Tejero J, M\u00e1rquez-Neila P, Wolf S, Zinkernagel M, Schoeffmann K, Sznitman R (2023) Domain adaptation for medical image segmentation using transformation-invariant self-training. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 331\u2013341","DOI":"10.1007\/978-3-031-43907-0_32"}],"container-title":["International Journal of Computer Assisted Radiology and Surgery"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11548-023-03046-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11548-023-03046-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11548-023-03046-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,23]],"date-time":"2024-11-23T12:06:09Z","timestamp":1732363569000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11548-023-03046-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,8]]},"references-count":28,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,5]]}},"alternative-id":["3046"],"URL":"https:\/\/doi.org\/10.1007\/s11548-023-03046-2","relation":{},"ISSN":["1861-6429"],"issn-type":[{"value":"1861-6429","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,8]]},"assertion":[{"value":"31 May 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 December 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 January 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"For this type of study, formal consent is not required.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}},{"value":"This article uses patient data from publicly available datasets.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Informed consent"}}]}}