{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T20:47:19Z","timestamp":1757623639352,"version":"3.44.0"},"reference-count":57,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2025,5,26]],"date-time":"2025-05-26T00:00:00Z","timestamp":1748217600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,5,26]],"date-time":"2025-05-26T00:00:00Z","timestamp":1748217600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Open-set object detection (OSOD), a task involving the detection of unknown objects while accurately detecting known objects, has recently gained attention. However, we identify a fundamental issue with the problem formulation employed in current OSOD studies. Inherent to object detection is knowing \u201cwhat to detect,\u201d which contradicts the idea of identifying \u201cunknown\u201d objects. This sets OSOD apart from open-set recognition (OSR). This contradiction complicates a proper evaluation of methods\u2019 performance, a fact that previous studies have overlooked. Next, we propose a novel formulation wherein detectors are required to detect both known and unknown classes within specified super-classes of object classes. This new formulation is free from the aforementioned issues and has practical applications. Finally, we design benchmark tests utilizing existing datasets and report the experimental evaluation of existing OSOD methods. 
The results show that existing methods fail to accurately detect unknown objects due to misclassification of known and unknown classes rather than incorrect bounding box prediction. As a byproduct, we introduce a taxonomy of OSOD, resolving confusion prevalent in the literature. We anticipate that our study will encourage the research community to reconsider OSOD and facilitate progress in the right direction.<\/jats:p>","DOI":"10.1007\/s11263-025-02479-3","type":"journal-article","created":{"date-parts":[[2025,5,26]],"date-time":"2025-05-26T12:53:39Z","timestamp":1748264019000},"page":"6145-6169","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Rethinking Open-Set Object Detection: Issues, A New Formulation, and Taxonomy"],"prefix":"10.1007","volume":"133","author":[{"given":"Yusuke","family":"Hosoya","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Masanori","family":"Suganuma","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Takayuki","family":"Okatani","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,5,26]]},"reference":[{"key":"2479_CR1","doi-asserted-by":"crossref","unstructured":"Bansal, A., Sikka, K., Sharma, G., et\u00a0al. (2018). Zero-Shot Object Detection. In: Proc. ECCV","DOI":"10.1007\/978-3-030-01246-5_24"},{"key":"2479_CR2","doi-asserted-by":"crossref","unstructured":"Bendale, A., Boult, T.E. (2016). Towards Open Set Deep Networks. In: Proc. CVPR","DOI":"10.1109\/CVPR.2016.173"},{"key":"2479_CR3","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., et\u00a0al. (2020). End-to-End Object Detection with Transformers. In: Proc. ECCV","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"2479_CR4","unstructured":"Chen, K., Wang, J., Pang, J., et\u00a0al. 
(2019). MMDetection: Open MMLab Detection Toolbox and Benchmark. arXiv:1906.07155"},{"key":"2479_CR5","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., et\u00a0al. (2009). ImageNet: A Large-Scale Hierarchical Image Database. In: Proc. CVPR","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"2479_CR6","doi-asserted-by":"crossref","unstructured":"Dhamija, A., Gunther, M., Ventura, J., et\u00a0al. (2020). The Overlooked Elephant of Object Detection: Open Set. In: Proc. WACV","DOI":"10.1109\/WACV45572.2020.9093355"},{"key":"2479_CR7","unstructured":"Du, X., Wang, Z., Cai, M., et\u00a0al. (2022). VOS: Learning What You Don\u2019t Know by Virtual Outlier Synthesis. In: Proc. ICLR"},{"key":"2479_CR8","unstructured":"Ertler, C., Mislej, J., Ollmann, T., et\u00a0al. (2020). Traffic Sign Detection and Classification around the World. In: Proc. ECCV"},{"key":"2479_CR9","doi-asserted-by":"crossref","unstructured":"Everingham, M., Van Gool, L., Williams, C. K. I., et al. (2010). The Pascal Visual Object Classes (VOC) Challenge. IJCV, 88(2), 303\u2013338.","DOI":"10.1007\/s11263-009-0275-4"},{"key":"2479_CR10","doi-asserted-by":"crossref","unstructured":"Fellbaum, C. (1998). WordNet: An electronic lexical database. MIT Press","DOI":"10.7551\/mitpress\/7287.001.0001"},{"issue":"9","key":"2479_CR11","doi-asserted-by":"publisher","first-page":"1627","DOI":"10.1109\/TPAMI.2009.167","volume":"32","author":"PF Felzenszwalb","year":"2010","unstructured":"Felzenszwalb, P. F., Girshick, R. B., McAllester, D., et al. (2010). Object Detection with Discriminatively Trained Part-Based Models. TPAMI, 32(9), 1627\u20131645.","journal-title":"TPAMI"},{"key":"2479_CR12","unstructured":"Gal, Y., & Ghahramani, Z. (2016). Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. In: Proc. ICML"},{"key":"2479_CR13","doi-asserted-by":"crossref","unstructured":"Ge, Z., Demyanov, S., Garnavi, R. (2017). Generative OpenMax for Multi-Class Open Set Classification. 
In: Proc. BMVC","DOI":"10.5244\/C.31.42"},{"key":"2479_CR14","unstructured":"Gu, X., Lin, T.Y., Kuo, W., et\u00a0al. (2022). Open-vocabulary Object Detection via Vision and Language Knowledge Distillation. In: Proc. ICLR"},{"key":"2479_CR15","doi-asserted-by":"crossref","unstructured":"Gupta, A., Narayan, S., Joseph, K.J., et\u00a0al. (2022). OW-DETR: Open-World Detection Transformer. In: Proc. CVPR","DOI":"10.1109\/CVPR52688.2022.00902"},{"key":"2479_CR16","doi-asserted-by":"crossref","unstructured":"Han, J., Ren, Y., Ding, J., et\u00a0al. (2022). Expanding Low-Density Latent Regions for Open-Set Object Detection. In: Proc. CVPR","DOI":"10.1109\/CVPR52688.2022.00937"},{"key":"2479_CR17","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., et\u00a0al. (2016). Deep Residual Learning for Image Recognition. In: Proc. CVPR","DOI":"10.1109\/CVPR.2016.90"},{"key":"2479_CR18","doi-asserted-by":"crossref","unstructured":"Joseph, K.J., Khan, S., Khan, F.S., et\u00a0al. (2021). Towards Open World Object Detection. In: Proc. CVPR","DOI":"10.1109\/CVPR46437.2021.00577"},{"key":"2479_CR19","doi-asserted-by":"crossref","unstructured":"Kong, S., & Ramanan, D. (2021). OpenGAN: Open-Set Recognition via Open Data Generation. In: Proc. ICCV","DOI":"10.1109\/ICCV48922.2021.00085"},{"key":"2479_CR20","unstructured":"Kuznetsova, A., Rom, H., Alldrin, N., et\u00a0al. (2018). The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale. arXiv:1811.00982"},{"key":"2479_CR21","doi-asserted-by":"crossref","unstructured":"Lee, K., Lee, K., Min, K., et\u00a0al. (2018). Hierarchical Novelty Detection for Visual Object Recognition. In: Proc. CVPR","DOI":"10.1109\/CVPR.2018.00114"},{"key":"2479_CR22","doi-asserted-by":"crossref","unstructured":"Li, L.H., Zhang, P., Zhang, H., et\u00a0al. (2022). Grounded Language-Image Pre-training. In: Proc. 
CVPR","DOI":"10.1109\/CVPR52688.2022.01069"},{"key":"2479_CR23","doi-asserted-by":"crossref","unstructured":"Lin, T., Maire, M., Belongie, S.J., et\u00a0al. (2014). Microsoft COCO: Common Objects in Context. In: Proc. ECCV","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"2479_CR24","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Dollar, P., Girshick, R., et\u00a0al. (2017). Feature Pyramid Networks for Object Detection. In: Proc. CVPR","DOI":"10.1109\/CVPR.2017.106"},{"key":"2479_CR25","doi-asserted-by":"crossref","unstructured":"Liu, S., Zeng, Z., Ren, T., et\u00a0al. (2023). Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection. arXiv:2303.05499","DOI":"10.1007\/978-3-031-72970-6_3"},{"key":"2479_CR26","doi-asserted-by":"crossref","unstructured":"Lu, C., Krishna, R., Bernstein, M.S., et\u00a0al. (2016). Visual Relationship Detection with Language Priors. In: Proc. ECCV","DOI":"10.1007\/978-3-319-46448-0_51"},{"key":"2479_CR27","doi-asserted-by":"crossref","unstructured":"Miller, D., Nicholson, L., Dayoub, F., et\u00a0al. (2018). Dropout Sampling for Robust Object Detection in Open-Set Conditions. In: Proc. ICRA","DOI":"10.1109\/ICRA.2018.8460700"},{"key":"2479_CR28","doi-asserted-by":"crossref","unstructured":"Miller, D., Dayoub, F., Milford, M., et\u00a0al. (2019). Evaluating Merging Strategies for Sampling-based Uncertainty Techniques in Object Detection. In: Proc. ICRA","DOI":"10.1109\/ICRA.2019.8793821"},{"key":"2479_CR29","doi-asserted-by":"crossref","unstructured":"Minderer, M., Gritsenko, A., Stone, A., et\u00a0al. (2022). Simple Open-Vocabulary Object Detection with Vision Transformers. In: Proc. ECCV","DOI":"10.1007\/978-3-031-20080-9_42"},{"key":"2479_CR30","doi-asserted-by":"crossref","unstructured":"Neal, L., Olson, M., Fern, X., et\u00a0al. (2018). Open Set Learning with Counterfactual Images. In: Proc. 
ECCV","DOI":"10.1007\/978-3-030-01231-1_38"},{"key":"2479_CR31","doi-asserted-by":"crossref","unstructured":"Oza, P., & Patel, V.M. (2019). C2AE: Class Conditioned Auto-Encoder for Open-Set Recognition. In: Proc. CVPR","DOI":"10.1109\/CVPR.2019.00241"},{"key":"2479_CR32","unstructured":"Pyakurel, S., & Yu, Q. (2024). Hierarchical Novelty Detection via Fine-Grained Evidence Allocation. In: Proc. ICML"},{"key":"2479_CR33","unstructured":"Radford, A., Kim, J.W., Hallacy, C., et\u00a0al. (2021). Learning Transferable Visual Models From Natural Language Supervision. In: Proc. ICML"},{"key":"2479_CR34","unstructured":"Ren, S., He, K., Girshick, R., et\u00a0al. (2015). Faster R-CNN: Towards Real-time Object Detection with Region Proposal Networks. In: Proc. NeurIPS"},{"key":"2479_CR35","doi-asserted-by":"crossref","unstructured":"Scheirer, W. J., de R. Rocha, A., Sapkota, A., et al. (2013). Toward Open Set Recognition. TPAMI, 35(7), 1757\u20131772.","DOI":"10.1109\/TPAMI.2012.256"},{"key":"2479_CR36","doi-asserted-by":"crossref","unstructured":"Shao, S., Li, Z., Zhang, T., et\u00a0al. (2019). Objects365: A Large-Scale, High-Quality Dataset for Object Detection. In: Proc. ICCV","DOI":"10.1109\/ICCV.2019.00852"},{"issue":"8","key":"2479_CR37","first-page":"888","volume":"22","author":"J Shi","year":"2000","unstructured":"Shi, J., & Malik, J. (2000). Normalized Cuts and Image Segmentation. TPAMI, 22(8), 888\u2013905.","journal-title":"TPAMI"},{"key":"2479_CR38","doi-asserted-by":"crossref","unstructured":"Shu, L., Xu, H., & Liu, B. (2017). DOC: Deep Open Classification of Text Documents. In: Proc. EMNLP","DOI":"10.18653\/v1\/D17-1314"},{"key":"2479_CR39","unstructured":"Singh, D.K., Rai, S.N., Joseph, K.J., et\u00a0al. (2021). ORDER: Open World Object Detection on Road Scenes. In: Proc. NeurIPS Workshops"},{"key":"2479_CR40","doi-asserted-by":"crossref","unstructured":"Sun, X., Yang, Z., Zhang, C., et\u00a0al. (2020). 
Conditional Gaussian Distribution Learning for Open Set Recognition. In: Proc. CVPR","DOI":"10.1109\/CVPR42600.2020.01349"},{"key":"2479_CR41","doi-asserted-by":"crossref","unstructured":"Sun, Z., Li, J., & Mu, Y. (2024). Exploring Orthogonality in Open World Object Detection. In: Proc. CVPR","DOI":"10.1109\/CVPR52733.2024.01638"},{"key":"2479_CR42","doi-asserted-by":"crossref","unstructured":"Tian, Z., Shen, C., Chen, H., et\u00a0al. (2019). FCOS: Fully Convolutional One-stage Object Detection. In: Proc. ICCV","DOI":"10.1109\/ICCV.2019.00972"},{"key":"2479_CR43","unstructured":"Vaze, S., Han, K., Vedaldi, A., et\u00a0al. (2022). Open-Set Recognition: A Good Closed-Set Classifier is All You Need. In: Proc. ICLR"},{"key":"2479_CR44","unstructured":"Wah, C., Branson, S., Welinder, P., et\u00a0al. (2011). Caltech-ucsd birds-200-2011. Tech. Rep. CNS-TR-2011-001, California Institute of Technology"},{"key":"2479_CR45","doi-asserted-by":"crossref","unstructured":"Wang, Y., Yue, Z., Hua, X.S., et\u00a0al. (2023). Random Boxes Are Open-world Object Detectors. In: Proc. ICCV","DOI":"10.1109\/ICCV51070.2023.00573"},{"key":"2479_CR46","doi-asserted-by":"crossref","unstructured":"Wu, X., Zhu, F., Zhao, R., et\u00a0al. (2023). CORA: Adapting CLIP for Open-Vocabulary Detection With Region Prompting and Anchor Pre-Matching. In: Proc. CVPR","DOI":"10.1109\/CVPR52729.2023.00679"},{"key":"2479_CR47","doi-asserted-by":"crossref","unstructured":"Wu, Z., Lu, Y., Chen, X., et\u00a0al. (2022). UC-OWOD: Unknown-Classified Open World Object Detection. In: Proc. ECCV","DOI":"10.1007\/978-3-031-20080-9_12"},{"key":"2479_CR48","unstructured":"Yao, L., Han, J., Wen, Y., et\u00a0al. (2022). DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection. In: Proc. NeurIPS"},{"key":"2479_CR49","doi-asserted-by":"crossref","unstructured":"Yao, L., Han, J., Liang, X., et\u00a0al. (2023). 
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-Training via Word-Region Alignment. In: Proc. CVPR","DOI":"10.1109\/CVPR52729.2023.02250"},{"key":"2479_CR50","doi-asserted-by":"crossref","unstructured":"Yoshihashi, R., Shao, W., Kawakami, R., et\u00a0al. (2019). Classification-Reconstruction Learning for Open-Set Recognition. In: Proc. CVPR","DOI":"10.1109\/CVPR.2019.00414"},{"key":"2479_CR51","doi-asserted-by":"crossref","unstructured":"Zareian, A., Rosa, K.D., Hu, D.H., et\u00a0al. (2021). Open-Vocabulary Object Detection Using Captions. In: Proc. CVPR","DOI":"10.1109\/CVPR46437.2021.01416"},{"issue":"8","key":"2479_CR52","first-page":"1690","volume":"39","author":"H Zhang","year":"2017","unstructured":"Zhang, H., & Patel, V. M. (2017). Sparse Representation-based Open Set Recognition. TPAMI, 39(8), 1690\u20131696.","journal-title":"TPAMI"},{"key":"2479_CR53","unstructured":"Zhao, X., Liu, X., Shen, Y., et\u00a0al. (2022). Revisiting Open World Object Detection. arXiv:2201.00471"},{"key":"2479_CR54","doi-asserted-by":"crossref","unstructured":"Zheng, J., Li, W., Hong, J., et\u00a0al. (2022). Towards Open-Set Object Detection and Discovery. In: Proc. CVPR Workshops","DOI":"10.1109\/CVPRW56347.2022.00441"},{"key":"2479_CR55","doi-asserted-by":"crossref","unstructured":"Zhou, D.W., Ye, H.J., & Zhan, D.C. (2021). Learning Placeholders for Open-Set Recognition. In: Proc. CVPR","DOI":"10.1109\/CVPR46437.2021.00438"},{"key":"2479_CR56","doi-asserted-by":"crossref","unstructured":"Zhou, X., Girdhar, R., Joulin, A., et\u00a0al. (2022). Detecting Twenty-thousand Classes using Image-level Supervision. In: Proc. ECCV","DOI":"10.1007\/978-3-031-20077-9_21"},{"key":"2479_CR57","unstructured":"Zhu, X., Su, W., Lu, L., et\u00a0al. (2021). Deformable DETR: Deformable Transformers for End-to-End Object Detection. In: Proc. 
ICLR"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-025-02479-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11263-025-02479-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-025-02479-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,9]],"date-time":"2025-09-09T08:03:29Z","timestamp":1757405009000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11263-025-02479-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,26]]},"references-count":57,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2025,9]]}},"alternative-id":["2479"],"URL":"https:\/\/doi.org\/10.1007\/s11263-025-02479-3","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"type":"print","value":"0920-5691"},{"type":"electronic","value":"1573-1405"}],"subject":[],"published":{"date-parts":[[2025,5,26]]},"assertion":[{"value":"10 July 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 May 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 May 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}