{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:28:20Z","timestamp":1760146100303,"version":"build-2065373602"},"reference-count":46,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2024,9,30]],"date-time":"2024-09-30T00:00:00Z","timestamp":1727654400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Key Research and Development Program of Shaanxi","award":["2024GX-ZDCYL-02-15"],"award-info":[{"award-number":["2024GX-ZDCYL-02-15"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Interactive image segmentation extremely accelerates the generation of high-quality annotation image datasets, which are the pillars of the applications of deep learning. However, these methods suffer from the insignificance of interaction information and excessively high optimization costs, resulting in unexpected segmentation outcomes and increased computational burden. To address these issues, this paper focuses on interactive information mining from the network architecture and optimization procedure. In terms of network architecture, the issue mentioned above arises from two perspectives: the less representative feature of interactive regions in each layer and the interactive information weakened by the network hierarchy structure. Therefore, the paper proposes a network called EnNet. The network addresses the two aforementioned issues by employing attention mechanisms to integrate user interaction information across the entire image and incorporating interaction information twice in a design that progresses from coarse to fine. In terms of optimization, this paper proposes a method of using zero-order optimization during the first four iterations of training. This approach can reduce computational overhead with only a minimal reduction in accuracy. The experimental results on GrabCut, Berkeley, DAVIS, and SBD datasets validate the effectiveness of the proposed method, with our approach achieving an average NOC@90 that surpasses RITM by 0.35.<\/jats:p>","DOI":"10.3390\/s24196361","type":"journal-article","created":{"date-parts":[[2024,9,30]],"date-time":"2024-09-30T12:06:32Z","timestamp":1727697992000},"page":"6361","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["EnNet: Enhanced Interactive Information Network with Zero-Order Optimization"],"prefix":"10.3390","volume":"24","author":[{"given":"Yingzhao","family":"Shao","sequence":"first","affiliation":[{"name":"State Key Laboratory of Integrated Services Networks, Xidian University, Xi\u2019an 710071, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanxin","family":"Chen","sequence":"additional","affiliation":[{"name":"Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xi\u2019an 710071, China"},{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710126, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pengfei","family":"Yang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xi\u2019an 710071, China"},{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710126, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fei","family":"Cheng","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710126, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,9,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.neucom.2021.08.105","article-title":"Lane-deeplab: Lane semantic segmentation in automatic driving scenarios for high-definition maps","volume":"465","author":"Li","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1016\/j.isprsjprs.2019.02.009","article-title":"Segmentation for Object-Based Image Analysis (OBIA): A review of algorithms and challenges from remote sensing perspective","volume":"150","author":"Hossain","year":"2019","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_3","unstructured":"Cheng, Z., Wu, Y., Xu, Z., Lukasiewicz, T., and Wang, W. (2019). Segmentation is all you need. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3391743","article-title":"Video object segmentation and tracking: A survey","volume":"11","author":"Yao","year":"2020","journal-title":"ACM Trans. Intell. Syst. Technol. (TIST)"},{"key":"ref_5","unstructured":"Jing, F., Li, M., Zhang, H.J., and Zhang, B. (2003, January 25\u201328). Unsupervised image segmentation using local homogeneity analysis. Proceedings of the 2003 IEEE International Symposium on Circuits and Systems (ISCAS), Bangkok, Thailand."},{"key":"ref_6","unstructured":"Hooker, S., Erhan, D., Kindermans, P.J., and Kim, B. (2019, January 8\u201314). A benchmark for interpretability methods in deep neural networks. Proceedings of the Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Vancouver, BC, Canada."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lin, Z., Zhang, Z., Chen, L.Z., Cheng, M.M., and Lu, S.P. (2020, January 13\u201319). Interactive image segmentation with first click attention. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01335"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Jacobsen, J.H., Van Gemert, J., Lou, Z., and Smeulders, A.W. (2016, January 27\u201330). Structured receptive fields in cnns. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.286"},{"key":"ref_9","unstructured":"Luo, W., Li, Y., Urtasun, R., and Zemel, R. (2016, January 5\u201310). Understanding the effective receptive field in deep convolutional neural networks. Proceedings of the 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain."},{"key":"ref_10","unstructured":"Cheng, T., Wang, X., Huang, L., and Liu, W. (2020, January 23\u201328). Boundary-preserving mask r-cnn. Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK. Part XIV 16."},{"key":"ref_11","unstructured":"Marin, D., He, Z., Vajda, P., Chatterjee, P., Tsai, S., Yang, F., and Boykov, Y. (November, January 27). Efficient segmentation: Learning downsampling near semantic boundaries. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Jang, W.D., and Kim, C.S. (2019, January 15\u201320). Interactive image segmentation via backpropagating refinement scheme. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00544"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Sofiiuk, K., Petrov, I., Barinova, O., and Konushin, A. (2020, January 13\u201319). f-brs: Rethinking backpropagating refinement for interactive segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00865"},{"key":"ref_14","unstructured":"Boykov, Y.Y., and Jolly, M.P. (2001, January 7\u201314). Interactive graph cuts for optimal boundary & region segmentation of objects in ND images. Proceedings of the Eighth IEEE International Conference on Computer Vision. ICCV 2001, Vancouver, BC, Canada."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1145\/1015706.1015720","article-title":"\u201cGrabCut\u201d interactive foreground extraction using iterated graph cuts","volume":"23","author":"Rother","year":"2004","journal-title":"ACM Trans. Graph. (TOG)"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"3734","DOI":"10.1109\/TIP.2012.2191566","article-title":"Robust interactive image segmentation using convex active contours","volume":"21","author":"Nguyen","year":"2012","journal-title":"IEEE Trans. Image Process."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1109\/TPAMI.2006.57","article-title":"Isoperimetric graph partitioning for image segmentation","volume":"28","author":"Grady","year":"2006","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.cviu.2013.10.012","article-title":"Random walks in directed hypergraphs and application to semi-supervised image segmentation","volume":"120","author":"Ducournau","year":"2014","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1109\/34.295913","article-title":"Seeded region growing","volume":"16","author":"Adams","year":"1994","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Qin, H., Zain, J.M., Ma, X., and Hai, T. (2010, January 10\u201312). Scene segmentation based on seeded region growing for foreground detection. Proceedings of the 2010 Sixth International Conference on Natural Computation, Yantai, China.","DOI":"10.1109\/ICNC.2010.5584032"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1006\/gmip.1998.0480","article-title":"Interactive segmentation with intelligent scissors","volume":"60","author":"Mortensen","year":"1998","journal-title":"Graph. Model. Image Process."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Xu, N., Price, B., Cohen, S., Yang, J., and Huang, T.S. (2016, January 27\u201330). Deep interactive object selection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.47"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Li, Z., Chen, Q., and Koltun, V. (2018, January 18\u201323). Interactive image segmentation with latent diversity. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00067"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1016\/j.neunet.2018.10.009","article-title":"A fully convolutional two-stream fusion network for interactive image segmentation","volume":"109","author":"Hu","year":"2019","journal-title":"Neural Netw."},{"key":"ref_25","unstructured":"Mahadevan, S., Voigtlaender, P., and Leibe, B. (2018). Iteratively trained interactive segmentation. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Liew, J., Wei, Y., Xiong, W., Ong, S.H., and Feng, J. (2017, January 22\u201329). Regional interactive image segmentation networks. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.297"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Lin, Z., Duan, Z.P., Zhang, Z., Guo, C.L., and Cheng, M.M. (2022, January 18\u201324). Focuscut: Diving into a focus view in interactive segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00266"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Hao, Y., Liu, Y., Wu, Z., Han, L., Chen, Y., Chen, G., Chu, L., Tang, S., Yu, Z., and Chen, Z. (2021, January 20\u201325). Edgeflow: Achieving practical interactive segmentation with edge-guided flow. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Nashville, TN, USA.","DOI":"10.1109\/ICCVW54120.2021.00180"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Maninis, K.K., Caelles, S., Pont-Tuset, J., and Van Gool, L. (2018, January 18\u201323). Deep extreme cut: From extreme points to object segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00071"},{"key":"ref_30","unstructured":"Zhang, S., Liew, J.H., Wei, Y., Wei, S., and Zhao, Y. (2022, January 18\u201324). Interactive object segmentation with inside-outside guidance. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Castrejon, L., Kundu, K., Urtasun, R., and Fidler, S. (2017, January 21\u201326). Annotating object instances with a polygon-rnn. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.477"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Acuna, D., Ling, H., Kar, A., and Fidler, S. (2018, January 18\u201323). Efficient interactive annotation of segmentation datasets with polygon-rnn++. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00096"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Chen, X., Zhao, Z., Yu, F., Zhang, Y., and Duan, M. (2021, January 10\u201317). Conditional diffusion for interactive segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00725"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Zheng, E., Yu, Q., Li, R., Shi, P., and Haake, A. (2021, January 2\u20139). A continual learning framework for uncertainty-aware interactive image segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual.","DOI":"10.1609\/aaai.v35i7.16752"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Liew, J.H., Cohen, S., Price, B., Mai, L., and Feng, J. (2021, January 5\u20139). Deep interactive thin object selection. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikola, HI, USA.","DOI":"10.1109\/WACV48630.2021.00035"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liu, Q., Zheng, M., Planche, B., Karanam, S., Chen, T., Niethammer, M., and Wu, Z. (2022, January 23\u201327). PseudoClick: Interactive image segmentation with click imitation. Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-20068-7_42"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Sofiiuk, K., Petrov, I.A., and Konushin, A. (2022, January 16\u201319). Reviving iterative training with mask guidance for interactive segmentation. Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France.","DOI":"10.1109\/ICIP46576.2022.9897365"},{"key":"ref_38","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Wu, Y., He, K., and Girshick, R. (2020, January 13\u201319). Pointrend: Image segmentation as rendering. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00982"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Chen, P.Y., Zhang, H., Sharma, Y., Yi, J., and Hsieh, C.J. (2017, January 3). Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Dallas, TX, USA.","DOI":"10.1145\/3128572.3140448"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland. Proceedings, Part V 13.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Gupta, A., Dollar, P., and Girshick, R. (2019, January 15\u201320). Lvis: A dataset for large vocabulary instance segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00550"},{"key":"ref_44","unstructured":"Martin, D., Fowlkes, C., Tal, D., and Malik, J. (2001, January 7\u201314). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. Proceedings of the Eighth IEEE International Conference on Computer Vision. ICCV, Vancouver, BC, Canada."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Perazzi, F., Pont-Tuset, J., McWilliams, B., Van Gool, L., Gross, M., and Sorkine-Hornung, A. (2016, January 27\u201330). A benchmark dataset and evaluation methodology for video object segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.85"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hariharan, B., Arbel\u00e1ez, P., Bourdev, L., Maji, S., and Malik, J. (2021, January 10\u201317). Semantic contours from inverse detectors. Proceedings of the 2011 International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV.2011.6126343"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/19\/6361\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:08:07Z","timestamp":1760112487000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/19\/6361"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,30]]},"references-count":46,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2024,10]]}},"alternative-id":["s24196361"],"URL":"https:\/\/doi.org\/10.3390\/s24196361","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2024,9,30]]}}}