{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T16:42:22Z","timestamp":1777567342315,"version":"3.51.4"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,2,4]],"date-time":"2024-02-04T00:00:00Z","timestamp":1707004800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,4]],"date-time":"2024-02-04T00:00:00Z","timestamp":1707004800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61871182 and 61773160"],"award-info":[{"award-number":["61871182 and 61773160"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62371188"],"award-info":[{"award-number":["62371188"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Natural Science Foundation of Hebei Province of China","award":["F2021502013"],"award-info":[{"award-number":["F2021502013"]}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["2020MS153 and 2021PT018"],"award-info":[{"award-number":["2020MS153 and 2021PT018"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Vis. 
Intell."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Because bounding-box annotations are unavailable, most weakly supervised target detection methods recast object detection as a classification problem over candidate regions, which biases weakly supervised detectors toward the salient, highly discriminative local areas of objects. We propose a weakly supervised target detection method that combines attention and erasure mechanisms. The method uses attention maps to find the more discriminative areas within candidate regions and then applies an erasure mechanism to erase those areas, forcing the model to strengthen its learning of features in less discriminative areas. To improve the localization ability of the detector, we cascade the weakly supervised target detection network with a fully supervised target detection network and jointly train the two networks through multi-task learning. Based on the validation trials, the category mean average precision (mAP) and the correct localization (CorLoc) on the two datasets, i.e., VOC2007 and VOC2012, are 55.2% and 53.8%, respectively. 
In regard to the mAP and CorLoc, this approach significantly outperforms previous approaches, which creates opportunities for additional investigations into weakly supervised target identification algorithms.<\/jats:p>","DOI":"10.1007\/s44267-024-00037-y","type":"journal-article","created":{"date-parts":[[2024,2,4]],"date-time":"2024-02-04T04:31:00Z","timestamp":1707021060000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Weakly supervised target detection based on spatial attention"],"prefix":"10.1007","volume":"2","author":[{"given":"Wenqing","family":"Zhao","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lijiao","family":"Xu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,2,4]]},"reference":[{"key":"37_CR1","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1016\/B978-0-12-820125-1.00017-8","volume-title":"Biosignal processing and classification using computational learning and intelligence: principles, algorithms, and applications","author":"E. F. Morales","year":"2022","unstructured":"Morales, E. F., & Escalante, H. J. (2022). A brief introduction to supervised, unsupervised, and reinforcement learning. In A. A. Torres-Garc\u00eda, C. A. Reyes-Garc\u00eda, L. Villase\u00f1or-Pineda, et al. (Eds.), Biosignal processing and classification using computational learning and intelligence: principles, algorithms, and applications (pp. 111\u2013129). New York: Academic Press."},{"issue":"6","key":"37_CR2","doi-asserted-by":"crossref","first-page":"1768","DOI":"10.11834\/jig.220178","volume":"27","author":"D. Ren","year":"2022","unstructured":"Ren, D., Wang, Q., Wei, Y., Meng, D., & Zuo, W. (2022). Progress in weakly supervised learning for visual understanding. 
International Journal of Image and Graphics, 27(6), 1768\u20131798.","journal-title":"International Journal of Image and Graphics"},{"key":"37_CR3","doi-asserted-by":"publisher","first-page":"192","DOI":"10.1016\/j.neucom.2022.01.095","volume":"496","author":"F. Shao","year":"2022","unstructured":"Shao, F., Chen, L., Shao, J., Ji, W., Xiao, S., Ye, L., et al. (2022). Deep learning for weakly-supervised object detection and localization: a survey. Neurocomputing, 496, 192\u2013207.","journal-title":"Neurocomputing"},{"key":"37_CR4","first-page":"3059","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"P. Tang","year":"2017","unstructured":"Tang, P., Wang, X., Bai, X., & Liu, W. (2017). Multiple instance detection network with online instance classifier refinement. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3059\u20133067). Piscataway: IEEE."},{"key":"37_CR5","first-page":"370","volume-title":"Proceedings of the 15th European conference on computer vision","author":"P. Tang","year":"2018","unstructured":"Tang, P., Wang, X., Wang, A., Yan, Y., Liu, W., Huang, J., et al. (2018). Weakly supervised region proposal network and object detection. In V. Ferrari, M.\u00a0Hebert, C. Sminchisescu, et al. (Eds.), Proceedings of the 15th European conference on computer vision (pp. 370\u2013386). Cham: Springer."},{"key":"37_CR6","first-page":"1297","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision & pattern recognition","author":"W. Fang","year":"2018","unstructured":"Fang, W., Wei, P., Jiao, J., Han, Z., & Ye, Q. (2018). Min-entropy latent model for weakly supervised object detection. In Proceedings of the IEEE\/CVF conference on computer vision & pattern recognition (pp. 1297\u20131306). Piscataway: IEEE."},{"key":"37_CR7","first-page":"2194","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"F. 
Wan","year":"2020","unstructured":"Wan, F., Liu, C., Ke, W., Ji, X., Jiao, J., & Ye, Q. (2020). C-MIL: continuation multiple instance learning for weakly supervised object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 2194\u20132203). Piscataway: IEEE."},{"key":"37_CR8","first-page":"4069","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition workshops","author":"K. Yang","year":"2020","unstructured":"Yang, K., Zhang, P., Qiao, P., Wang, Z., & Dou, Y. (2020). Rethinking segmentation guidance for weakly supervised object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition workshops (pp. 4069\u20134073). Piscataway: IEEE."},{"key":"37_CR9","first-page":"10595","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Z. Ren","year":"2020","unstructured":"Ren, Z., Yu, Z., Yang, X., Liu, M., Lee, Y. J., Schwing, A. G., et al. (2020). Instance-aware, context-focused, and memory-efficient weakly supervised object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 10595\u201310604). Piscataway: IEEE."},{"issue":"1","key":"37_CR10","doi-asserted-by":"publisher","first-page":"176","DOI":"10.1109\/TPAMI.2018.2876304","volume":"42","author":"P. Tang","year":"2020","unstructured":"Tang, P., Wang, X., Bai, S., Shen, W., Bai, X., Liu, W., et al. (2020). PCL: proposal cluster learning for weakly supervised object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(1), 176\u2013191.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"37_CR11","first-page":"697","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Y. Shen","year":"2019","unstructured":"Shen, Y., Ji, R., Wang, Y., Wu, Y., & Cao, L. (2019). 
Cyclic guidance for weakly supervised joint detection and segmentation. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 697\u2013707). Piscataway: IEEE."},{"issue":"8","key":"37_CR12","doi-asserted-by":"publisher","first-page":"10394","DOI":"10.1109\/TPAMI.2023.3243054","volume":"45","author":"L. Sui","year":"2023","unstructured":"Sui, L., Zhang, C., & Wu, J. (2023). Salvage of supervision in weakly supervised object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(8), 10394\u201310408.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"11","key":"37_CR13","doi-asserted-by":"crossref","first-page":"2561","DOI":"10.11834\/jig.200697","volume":"26","author":"W. Zhao","year":"2021","unstructured":"Zhao, W., Zhang, H., & Xu, M. (2021). Insulator recognition based on an improved scale-transferrable network. International Journal of Image and Graphics, 26(11), 2561\u20132570.","journal-title":"International Journal of Image and Graphics"},{"issue":"6","key":"37_CR14","first-page":"1098","volume":"16","author":"W. Zhao","year":"2021","unstructured":"Zhao, W., & Yang, P. (2021). Target detection based on bidirectional feature fusion and an attention mechanism. CAAI Transactions on Intelligent Systems, 16(6), 1098\u20131105.","journal-title":"CAAI Transactions on Intelligent Systems"},{"key":"37_CR15","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.120381","volume":"228","author":"C. K. Sunil","year":"2023","unstructured":"Sunil, C. K., Jaidhar, C. D., & Patil, N. (2023). Tomato plant disease classification using multilevel feature fusion with adaptive channel spatial and pixel attention mechanism. Expert Systems with Applications, 228, 120381.","journal-title":"Expert Systems with Applications"},{"key":"37_CR16","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1016\/j.cag.2023.04.007","volume":"113","author":"X. 
Song","year":"2023","unstructured":"Song, X., Liu, W., Liang, L., Shi, W., Xie, G., Lu, X., et al. (2023). Image super-resolution with multi-scale fractal residual attention network. Computers & Graphics, 113, 21\u201331.","journal-title":"Computers & Graphics"},{"key":"37_CR17","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.120330","volume":"228","author":"J. Wang","year":"2023","unstructured":"Wang, J., Zhang, X., Jing, K., & Zhang, C. (2023). Learning precise feature via self-attention and self-cooperation yolox for smoke detection. Expert Systems with Applications, 228, 120330.","journal-title":"Expert Systems with Applications"},{"key":"37_CR18","first-page":"3544","volume-title":"Proceedings of the IEEE international conference on computer vision","author":"K. K. Singh","year":"2017","unstructured":"Singh, K. K., & Lee, Y. J. (2017). Hide-and-seek: forcing a network to be meticulous for weakly-supervised object and action localization. In Proceedings of the IEEE international conference on computer vision (pp. 3544\u20133553). Piscataway: IEEE."},{"key":"37_CR19","first-page":"2219","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"J. Choe","year":"2019","unstructured":"Choe, J., & Shim, H. (2019). Attention-based dropout layer for weakly supervised object localization. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 2219\u20132228). Piscataway: IEEE."},{"key":"37_CR20","first-page":"454","volume-title":"Proceedings of the 15th European conference on computer vision","author":"Y. Wei","year":"2018","unstructured":"Wei, Y., Shen, Z., Cheng, B., Shi, H., Xiong, J., Feng, J., et al. (2018). Ts2c: tight box mining with surrounding segmentation context for weakly supervised object detection. In V. Ferrari, M. Hebert, C. Sminchisescu, et al. (Eds.), Proceedings of the 15th European conference on computer vision (pp. 454\u2013470). 
Cham: Springer."},{"issue":"2","key":"37_CR21","doi-asserted-by":"publisher","first-page":"154","DOI":"10.1007\/s11263-013-0620-5","volume":"104","author":"J. R. R. Uijlings","year":"2013","unstructured":"Uijlings, J. R. R., van de Sande, K. E. A., Gevers, T., & Smeulders, A. W. M. (2013). Selective search for object recognition. International Journal of Computer Vision, 104(2), 154\u2013171.","journal-title":"International Journal of Computer Vision"},{"key":"37_CR22","first-page":"10750","volume-title":"Proceedings of the 32nd international conference on neural information processing systems","author":"G. Ghiasi","year":"2018","unstructured":"Ghiasi, G., Lin, T., & Le, Q. V. (2018). Dropblock: a regularization method for convolutional networks. In S. Bengio, H. M. Wallach, H. Larochelle, et al. (Eds.), Proceedings of the 32nd international conference on neural information processing systems. (pp. 10750\u201310760). Red Hook: Curran Associates."},{"key":"37_CR23","first-page":"2846","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"H. Bilen","year":"2016","unstructured":"Bilen, H., & Vedaldi, A. (2016). Weakly supervised deep detection networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2846\u20132854). Piscataway: IEEE."},{"key":"37_CR24","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2021.104314","volume":"116","author":"Z. Chen","year":"2021","unstructured":"Chen, Z., Fu, Z., Huang, J., Tao, M., Jiang, R., Tian, X., et al. (2021). Spatial likelihood voting with self-knowledge distillation for weakly supervised object detection. Image and Vision Computing, 116, 104314.","journal-title":"Image and Vision Computing"},{"key":"37_CR25","first-page":"16797","volume-title":"Proceedings of the 34th international conference on neural information processing systems","author":"Z. Huang","year":"2020","unstructured":"Huang, Z., Zou, Y., Kumar, B. V. K. V., & Huang, D. 
(2020). Comprehensive attention self-distillation for weakly-supervised object detection. In H. Larochelle, M. Ranzato, R. Hadsell, et al. (Eds.), Proceedings of the 34th international conference on neural information processing systems (pp. 16797\u201316807). Red Hook: Curran Associates."}],"container-title":["Visual Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44267-024-00037-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44267-024-00037-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44267-024-00037-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,10]],"date-time":"2024-11-10T02:22:01Z","timestamp":1731205321000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44267-024-00037-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,4]]},"references-count":25,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["37"],"URL":"https:\/\/doi.org\/10.1007\/s44267-024-00037-y","relation":{},"ISSN":["2731-9008"],"issn-type":[{"value":"2731-9008","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,4]]},"assertion":[{"value":"28 June 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 January 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 January 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 February 
2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"2"}}