{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T06:20:43Z","timestamp":1773123643962,"version":"3.50.1"},"reference-count":72,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T00:00:00Z","timestamp":1773014400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T00:00:00Z","timestamp":1773014400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100020870","name":"Office of the Vice-President for Research and Development, Hong Kong University of Science and Technology","doi-asserted-by":"publisher","award":["R9429"],"award-info":[{"award-number":["R9429"]}],"id":[{"id":"10.13039\/501100020870","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001445","name":"DSO National Laboratories - Singapore","doi-asserted-by":"publisher","award":["AISG2-GC-2023-008"],"award-info":[{"award-number":["AISG2-GC-2023-008"]}],"id":[{"id":"10.13039\/501100001445","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001348","name":"Agency for Science, Technology and Research","doi-asserted-by":"publisher","award":["C233312028"],"award-info":[{"award-number":["C233312028"]}],"id":[{"id":"10.13039\/501100001348","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100008629","name":"Info-communications Media Development Authority","doi-asserted-by":"publisher","award":["DTC-RGC-04"],"award-info":[{"award-number":["DTC-RGC-04"]}],"id":[{"id":"10.13039\/501100008629","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput 
Vis"],"published-print":{"date-parts":[[2026,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Neural networks trained on large-scale data have achieved remarkable success across a wide range of computer vision tasks. However, far less attention has been paid to monitoring camouflaged animals, masters of hiding themselves in the background. Robust and precise segmentation of camouflaged animals is challenging even for domain experts due to their similarity to the environment. Although several efforts have been made toward camouflaged animal image segmentation, to the best of our knowledge, limited work exists on camouflaged animal video understanding (CAVU). Biologists often prefer videos for monitoring and understanding animal behaviors, as videos provide redundant information and temporal consistency. However, the scarcity of labeled video data significantly hinders progress in this area. To address these challenges, we present\n                    <jats:bold>CamoVid60K<\/jats:bold>\n                    , a diverse, large-scale, and accurately annotated video dataset of camouflaged animals. 
This dataset comprises\n                    <jats:bold>218<\/jats:bold>\n                    videos with\n                    <jats:bold>62,774<\/jats:bold>\n                    finely annotated frames, covering\n                    <jats:bold>70<\/jats:bold>\n                    animal categories, which\n                    <jats:italic>surpasses<\/jats:italic>\n                    all previous datasets in terms of the number of videos\/frames and species included.\n                    <jats:bold>CamoVid60K<\/jats:bold>\n                    also supports more diverse downstream tasks in computer vision, including camouflaged animal classification, detection, and task-specific segmentation (semantic, referring, and motion).\n                    We have benchmarked several state-of-the-art algorithms on the proposed\n                    <jats:bold>CamoVid60K<\/jats:bold>\n                    dataset, and the experimental results provide valuable insights for future research directions. 
Our dataset serves as a novel and challenging benchmark to stimulate the development of more powerful camouflaged animal video segmentation algorithms, with substantial room for further improvement.\n                  <\/jats:p>","DOI":"10.1007\/s11263-026-02765-8","type":"journal-article","created":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T17:27:37Z","timestamp":1773077257000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["CamoVid60K: A Large-Scale Video Dataset for Moving Camouflaged Animals Understanding"],"prefix":"10.1007","volume":"134","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8872-0875","authenticated-orcid":false,"given":"Tuan-Anh","family":"Vu","sequence":"first","affiliation":[]},{"given":"Ziqiang","family":"Zheng","sequence":"additional","affiliation":[]},{"given":"Chengyang","family":"Song","sequence":"additional","affiliation":[]},{"given":"Qing","family":"Guo","sequence":"additional","affiliation":[]},{"given":"Ivor W.","family":"Tsang","sequence":"additional","affiliation":[]},{"given":"Sai-Kit","family":"Yeung","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,3,9]]},"reference":[{"key":"2765_CR1","doi-asserted-by":"crossref","unstructured":"Beery, S., Van\u00a0Horn, G., Perona, P.: Recognition in terra incognita. In: ECCV, pp. 456\u2013473 (2018)","DOI":"10.1007\/978-3-030-01270-0_28"},{"key":"2765_CR2","doi-asserted-by":"crossref","unstructured":"Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: ECCV (2012)","DOI":"10.1007\/978-3-642-33783-3_44"},{"key":"2765_CR3","doi-asserted-by":"crossref","unstructured":"Caron, M., Touvron, H., Misra, I., J\u00e9gou, H., Mairal, J., Bojanowski, P., Joulin, A.: Emerging properties in self-supervised vision transformers. 
In: ICCV (2021)","DOI":"10.1109\/ICCV48922.2021.00951"},{"key":"2765_CR4","doi-asserted-by":"crossref","unstructured":"Cheng, B., Misra, I., Schwing, A.G., Kirillov, A., Girdhar, R.: Masked-attention mask transformer for universal image segmentation. In: CVPR, pp. 1290\u20131299 (2022)","DOI":"10.1109\/CVPR52688.2022.00135"},{"key":"2765_CR5","doi-asserted-by":"crossref","unstructured":"Cheng, H.K., Schwing, A.G.: XMem: Long-term video object segmentation with an atkinson-shiffrin memory model. In: ECCV (2022)","DOI":"10.1007\/978-3-031-19815-1_37"},{"key":"2765_CR6","doi-asserted-by":"crossref","unstructured":"Cheng, X., Xiong, H., Fan, D.-P., Zhong, Y., Harandi, M., Drummond, T., Ge, Z.: Implicit motion handling for video camouflaged object detection. In: CVPR (2022)","DOI":"10.1109\/CVPR52688.2022.01349"},{"key":"2765_CR7","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database. In: CVPR, pp. 248\u2013255 (2009)","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"2765_CR8","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)"},{"key":"2765_CR9","doi-asserted-by":"crossref","unstructured":"Dosovitskiy, A., Fischer, P., Ilg, E., H\u00e4usser, P., Haz\u0131rba\u015f, C., Golkov, V., Smagt, P., Cremers, D., Brox, T.: FlowNet: Learning optical flow with convolutional networks. In: ICCV (2015)","DOI":"10.1109\/ICCV.2015.316"},{"key":"2765_CR10","doi-asserted-by":"crossref","unstructured":"Fan, D.-P., Cheng, M.-M., Liu, Y., Li, T., Borji, A.: Structure-measure: A New Way to Evaluate Foreground Maps. 
In: ICCV (2017)","DOI":"10.1109\/ICCV.2017.487"},{"key":"2765_CR11","doi-asserted-by":"crossref","unstructured":"Fan, D.-P., Ji, G.-P., Xu, P., Cheng, M.-M., Sakaridis, C., Van\u00a0Gool, L.: Advances in deep concealed scene understanding. Visual Intelligence (2023)","DOI":"10.1007\/s44267-023-00019-6"},{"key":"2765_CR12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s44267-023-00019-6","volume":"1","author":"D-P Fan","year":"2023","unstructured":"Fan, D.-P., Ji, G.-P., Xu, P., Cheng, M.-M., Sakaridis, C., & Van Gool, L. (2023). Advances in deep concealed scene understanding. Visual Intelligence, 1, 1\u201316.","journal-title":"Visual Intelligence"},{"key":"2765_CR13","doi-asserted-by":"crossref","unstructured":"Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J.W., Wallach, H., Iii, H.D., Crawford, K.: Datasheets for datasets. Communications of the ACM 64(12), 86\u201392 (2021)","DOI":"10.1145\/3458723"},{"issue":"12","key":"2765_CR14","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1145\/3458723","volume":"64","author":"T Gebru","year":"2021","unstructured":"Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Iii, H. D., & Crawford, K. (2021). Datasheets for datasets. Communications of the ACM, 64(12), 86\u201392.","journal-title":"Communications of the ACM"},{"key":"2765_CR15","doi-asserted-by":"crossref","unstructured":"He, R., Dong, Q., Lin, J., Lau, R.W.: Weakly-supervised camouflaged object detection with scribble annotations. In: AAAI (2023)","DOI":"10.1609\/aaai.v37i1.25156"},{"key":"2765_CR16","doi-asserted-by":"crossref","unstructured":"He, C., Li, K., Zhang, Y., Tang, L., Zhang, Y., Guo, Z., Li, X.: Camouflaged object detection with feature decomposition and edge reconstruction. In: CVPR (2023)","DOI":"10.1109\/CVPR52729.2023.02111"},{"key":"2765_CR17","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 
770\u2013778 (2016)","DOI":"10.1109\/CVPR.2016.90"},{"key":"2765_CR18","doi-asserted-by":"crossref","unstructured":"Ji, G.-P., Fan, D.-P., Chou, Y.-C., Dai, D., Liniger, A., Van\u00a0Gool, L.: Deep gradient learning for efficient camouflaged object detection. MIR (2023)","DOI":"10.1007\/s11633-022-1365-9"},{"key":"2765_CR19","doi-asserted-by":"crossref","unstructured":"Ji, P., Zhong, Y., Li, H., Salzmann, M.: Null space clustering with applications to motion segmentation and face clustering. In: ICIP, pp. 283\u2013287 (2014)","DOI":"10.1109\/ICIP.2014.7025056"},{"key":"2765_CR20","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1007\/s11633-022-1365-9","volume":"20","author":"G-P Ji","year":"2023","unstructured":"Ji, G.-P., Fan, D.-P., Chou, Y.-C., Dai, D., Liniger, A., & Van Gool, L. (2023). Deep gradient learning for efficient camouflaged object detection. Machine Intelligence Research, 20, 92\u2013108.","journal-title":"Machine Intelligence Research"},{"key":"2765_CR21","doi-asserted-by":"crossref","unstructured":"Kowal, M., Siam, M., Islam, M.A., Bruce, N.D., Wildes, R.P., Derpanis, K.G.: A deeper dive into what deep spatiotemporal networks encode: Quantifying static vs. dynamic information. In: CVPR (2022)","DOI":"10.1109\/CVPR52688.2022.01361"},{"key":"2765_CR22","doi-asserted-by":"crossref","unstructured":"Lamdouar, H., Xie, W., Zisserman, A.: Segmenting invisible moving objects. In: BMVC (2021)","DOI":"10.5244\/C.35.13"},{"key":"2765_CR23","doi-asserted-by":"crossref","unstructured":"Lamdouar, H., Xie, W., Zisserman, A.: The making and breaking of camouflage. In: ICCV, pp. 832\u2013842 (2023)","DOI":"10.1109\/ICCV51070.2023.00083"},{"key":"2765_CR24","doi-asserted-by":"crossref","unstructured":"Lamdouar, H., Yang, C., Xie, W., Zisserman, A.: Betrayed by motion: Camouflaged object discovery via motion segmentation. 
ACCV (2020)","DOI":"10.1007\/978-3-030-69532-3_30"},{"key":"2765_CR25","doi-asserted-by":"crossref","unstructured":"Le, T.-N., Nguyen, T.V., Nie, Z., Tran, M.-T., Sugimoto, A.: Anabranch network for camouflaged object segmentation. CVIU (2019)","DOI":"10.1016\/j.cviu.2019.04.006"},{"key":"2765_CR26","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1109\/TIP.2021.3130490","volume":"31","author":"T-N Le","year":"2021","unstructured":"Le, T.-N., Cao, Y., Nguyen, T.-C., Le, M.-Q., Nguyen, K.-D., Do, T.-T., Tran, M.-T., & Nguyen, T. V. (2021). Camouflaged instance segmentation in-the-wild: Dataset, method, and benchmark suite. IEEE Transactions on Image Processing, 31, 287\u2013300.","journal-title":"IEEE Transactions on Image Processing"},{"key":"2765_CR27","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., Zitnick, C.L.: Microsoft COCO: Common objects in context. In: ECCV, pp. 740\u2013755 (2014). Springer","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"2765_CR28","doi-asserted-by":"crossref","unstructured":"Lv, Y., Zhang, J., Dai, Y., Li, A., Liu, B., Barnes, N., Fan, D.-P.: Simultaneously localize, segment and rank the camouflaged objects. In: CVPR (2021)","DOI":"10.1109\/CVPR46437.2021.01142"},{"key":"2765_CR29","doi-asserted-by":"crossref","unstructured":"Margolin, R., Zelnik-Manor, L., Tal, A.: How to evaluate foreground maps? In: CVPR (2014)","DOI":"10.1109\/CVPR.2014.39"},{"key":"2765_CR30","doi-asserted-by":"crossref","unstructured":"Mayer, N., Ilg, E., H\u00e4usser, P., Fischer, P., Cremers, D., Dosovitskiy, A., Brox, T.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. 
In: IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) (2016)","DOI":"10.1109\/CVPR.2016.438"},{"key":"2765_CR31","doi-asserted-by":"crossref","unstructured":"Menze, M., Geiger, A.: Object scene flow for autonomous vehicles. In: CVPR (2015)","DOI":"10.1109\/CVPR.2015.7298925"},{"key":"2765_CR32","doi-asserted-by":"crossref","unstructured":"Meunier, E., Badoual, A., Bouthemy, P.: Em-driven unsupervised learning for efficient motion segmentation. IEEE T-PAMI (2022)","DOI":"10.1109\/TPAMI.2022.3198480"},{"key":"2765_CR33","first-page":"4462","volume":"45","author":"E Meunier","year":"2022","unstructured":"Meunier, E., Badoual, A., & Bouthemy, P. (2022). Em-driven unsupervised learning for efficient motion segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, 4462\u20134473.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"2765_CR34","doi-asserted-by":"crossref","unstructured":"Norouzzadeh, M.S., Nguyen, A., Kosmala, M., Swanson, A., Palmer, M.S., Packer, C., Clune, J.: Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning. PNAS 115(25), 5716\u20135725 (2018)","DOI":"10.1073\/pnas.1719367115"},{"issue":"25","key":"2765_CR35","doi-asserted-by":"publisher","first-page":"5716","DOI":"10.1073\/pnas.1719367115","volume":"115","author":"MS Norouzzadeh","year":"2018","unstructured":"Norouzzadeh, M. S., Nguyen, A., Kosmala, M., Swanson, A., Palmer, M. S., Packer, C., & Clune, J. (2018). Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning. PNAS, 115(25), 5716\u20135725.","journal-title":"PNAS"},{"key":"2765_CR36","unstructured":"OpenAI, t.: GPT-4 technical report. 
arXiv preprint arXiv:2303.08774 (2023)"},{"key":"2765_CR37","unstructured":"Oquab, M., Darcet, T., Moutakanni, T., Vo, H.V., Szafraniec, M., Khalidov, V., Fernandez, P., HAZIZA, D., Massa, F., El-Nouby, A., Assran, M., Ballas, N., Galuba, W., Howes, R., Huang, P.-Y., Li, S.-W., Misra, I., Rabbat, M., Sharma, V., Synnaeve, G., Xu, H., Jegou, H., Mairal, J., Labatut, P., Joulin, A., Bojanowski, P.: DINOv2: Learning robust visual features without supervision. TMLR (2024)"},{"key":"2765_CR38","doi-asserted-by":"crossref","unstructured":"Pei, J., Cheng, T., Fan, D.-P., Tang, H., Chen, C., Van\u00a0Gool, L.: Osformer: One-stage camouflaged instance segmentation with transformers. In: ECCV (2022)","DOI":"10.1007\/978-3-031-19797-0_2"},{"key":"2765_CR39","doi-asserted-by":"crossref","unstructured":"Pia\u00a0Bideau, E.L.-M.: It\u2019s moving! a probabilistic model for causal motion segmentation in moving camera videos. In: ECCV (2016)","DOI":"10.1007\/978-3-319-46484-8_26"},{"key":"2765_CR40","unstructured":"Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: ICML, pp. 8748\u20138763 (2021)"},{"key":"2765_CR41","doi-asserted-by":"crossref","unstructured":"Rands, M.R., Adams, W.M., Bennun, L., Butchart, S.H., Clements, A., Coomes, D., Entwistle, A., Hodge, I., Kapos, V., Scharlemann, J.P., and others: Biodiversity conservation: challenges beyond 2010. Science 329(5997), 1298\u20131303 (2010)","DOI":"10.1126\/science.1189138"},{"key":"2765_CR42","unstructured":"Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards real-time object detection with region proposal networks. NeurIPS 28 (2015)"},{"key":"2765_CR43","doi-asserted-by":"crossref","unstructured":"Seo, S., Lee, J.-Y., Han, B.: Urvos: Unified referring video object segmentation network with a large-scale benchmark. In: ECCV, pp. 
208\u2013223 (2020). Springer","DOI":"10.1007\/978-3-030-58555-6_13"},{"key":"2765_CR44","doi-asserted-by":"crossref","unstructured":"Shao, S., Li, Z., Zhang, T., Peng, C., Yu, G., Zhang, X., Li, J., Sun, J.: Objects365: A large-scale, high-quality dataset for object detection. In: ICCV, pp. 8430\u20138439 (2019)","DOI":"10.1109\/ICCV.2019.00852"},{"key":"2765_CR45","doi-asserted-by":"crossref","unstructured":"Shen, S., Li, C., Hu, X., Xie, Y., Yang, J., Zhang, P., Gan, Z., Wang, L., Yuan, L., Liu, C., and others: K-lite: Learning transferable visual models with external knowledge. NeurIPS 35, 15558\u201315573 (2022)","DOI":"10.52202\/068431-1132"},{"key":"2765_CR46","doi-asserted-by":"crossref","unstructured":"Sim\u00f5es, F., Bouveyron, C., Precioso, F.: Deepwild: Wildlife identification, localisation and estimation on camera trap videos using deep learning. Ecological Informatics 75, 102095 (2023)","DOI":"10.1016\/j.ecoinf.2023.102095"},{"key":"2765_CR47","doi-asserted-by":"publisher","DOI":"10.1016\/j.ecoinf.2023.102095","volume":"75","author":"F Sim\u00f5es","year":"2023","unstructured":"Sim\u00f5es, F., Bouveyron, C., & Precioso, F. (2023). Deepwild: Wildlife identification, localisation and estimation on camera trap videos using deep learning. Ecological Informatics, 75, Article 102095.","journal-title":"Ecological Informatics"},{"key":"2765_CR48","doi-asserted-by":"crossref","unstructured":"Soofi, M., Sharma, S., Safaei-Mahroo, B., Sohrabi, M., Organli, M.G., Waltert, M.: Lichens and animal camouflage: some observations from central asian ecoregions. Journal of Threatened Taxa 14(2), 20672\u201320676 (2022)","DOI":"10.11609\/jott.7558.14.2.20672-20676"},{"issue":"2","key":"2765_CR49","doi-asserted-by":"publisher","first-page":"20672","DOI":"10.11609\/jott.7558.14.2.20672-20676","volume":"14","author":"M Soofi","year":"2022","unstructured":"Soofi, M., Sharma, S., Safaei-Mahroo, B., Sohrabi, M., Organli, M. G., & Waltert, M. (2022). 
Lichens and animal camouflage: some observations from central asian ecoregions. Journal of Threatened Taxa, 14(2), 20672\u201320676.","journal-title":"Journal of Threatened Taxa"},{"key":"2765_CR50","doi-asserted-by":"crossref","unstructured":"Sun, G., An, Z., Liu, Y., Liu, C., Sakaridis, C., Fan, D.-P., Van\u00a0Gool, L.: Indiscernible object counting in underwater scenes. In: CVPR (2023)","DOI":"10.1109\/CVPR52729.2023.01325"},{"key":"2765_CR51","doi-asserted-by":"crossref","unstructured":"Teed, Z., Deng, J.: RAFT: Recurrent all-pairs field transforms for optical flow. In: ECCV (2020)","DOI":"10.1007\/978-3-030-58536-5_24"},{"key":"2765_CR52","doi-asserted-by":"crossref","unstructured":"Troscianko, J., Skelhorn, J., Stevens, M.: Quantifying camouflage: how to predict detectability from appearance. BMC Evolutionary Biology 17, 1\u201313 (2017)","DOI":"10.1186\/s12862-016-0854-2"},{"key":"2765_CR53","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12862-016-0854-2","volume":"17","author":"J Troscianko","year":"2017","unstructured":"Troscianko, J., Skelhorn, J., & Stevens, M. (2017). Quantifying camouflage: how to predict detectability from appearance. BMC Evolutionary Biology, 17, 1\u201313.","journal-title":"BMC Evolutionary Biology"},{"key":"2765_CR54","doi-asserted-by":"crossref","unstructured":"Truong, Q.-T., Vu, T.-A., Ha, T.-S., Loko\u010d, J., Tim, Y.H.W., Joneja, A., Yeung, S.-K.: Marine Video Kit: A new marine video dataset for content-based analysis and retrieval. In: MMM. Springer, ??? (2023)","DOI":"10.1007\/978-3-031-27077-2_42"},{"key":"2765_CR55","unstructured":"Vu, T.-A., Nguyen, D.T., Guo, Q., Hua, B.-S., Chung, N.M., Tsang, I.W., Yeung, S.-K.: Leveraging open-vocabulary diffusion to camouflaged instance segmentation. 
arXiv preprint arXiv:2312.17505 (2023)"},{"key":"2765_CR56","doi-asserted-by":"crossref","unstructured":"Wang, Q., Chang, Y.-Y., Cai, R., Li, Z., Hariharan, B., Holynski, A., Snavely, N.: Tracking everything everywhere all at once. In: ICCV (2023)","DOI":"10.1109\/ICCV51070.2023.01813"},{"issue":"4","key":"2765_CR57","doi-asserted-by":"publisher","first-page":"899","DOI":"10.1007\/s11263-022-01732-3","volume":"131","author":"F Wang","year":"2022","unstructured":"Wang, F., Cao, P., Li, F., Wang, X., He, B., & Sun, F. (2022). Watb: Wild animal tracking benchmark. International Journal of Computer Vision, 131(4), 899\u2013917. https:\/\/doi.org\/10.1007\/s11263-022-01732-3","journal-title":"International Journal of Computer Vision"},{"key":"2765_CR58","doi-asserted-by":"crossref","unstructured":"Xie, C., Xiang, Y., Harchaoui, Z., Fox, D.: Object discovery in videos as foreground motion clustering. In: CVPR (2019)","DOI":"10.1109\/CVPR.2019.01023"},{"key":"2765_CR59","doi-asserted-by":"crossref","unstructured":"Xie, J., Xie, W., Zisserman, A.: Segmenting moving objects via an object-centric layered representation. NeurIPS (2022)","DOI":"10.52202\/068431-2032"},{"key":"2765_CR60","doi-asserted-by":"publisher","first-page":"28023","DOI":"10.52202\/068431-2032","volume":"35","author":"J Xie","year":"2022","unstructured":"Xie, J., Xie, W., & Zisserman, A. (2022). Segmenting moving objects via an object-centric layered representation. Advances in neural information processing systems, 35, 28023\u201328036.","journal-title":"Advances in neural information processing systems"},{"key":"2765_CR61","doi-asserted-by":"crossref","unstructured":"Yang, C., Lamdouar, H., Lu, E., Zisserman, A., Xie, W.: Self-supervised video object segmentation by motion grouping. 
In: ICCV (2021)","DOI":"10.1109\/ICCV48922.2021.00709"},{"key":"2765_CR62","doi-asserted-by":"crossref","unstructured":"Yang, J., Li, C., Zhang, P., Xiao, B., Liu, C., Yuan, L., Gao, J.: Unified contrastive learning in image-text-label space. In: CVPR, pp. 19163\u201319173 (2022)","DOI":"10.1109\/CVPR52688.2022.01857"},{"key":"2765_CR63","unstructured":"Yang, Z., Wang, J., Ye, X., Tang, Y., Chen, K., Zhao, H., Torr, P.S.: Language-aware vision transformer for referring segmentation. IEEE T-PAMI 00(00), 1\u201318 (2024)"},{"key":"2765_CR64","doi-asserted-by":"crossref","unstructured":"Zeng, Y., Zhang, P., Zhang, J., Lin, Z., Lu, H.: Towards high-resolution salient object detection. In: ICCV, pp. 7234\u20137243 (2019)","DOI":"10.1109\/ICCV.2019.00733"},{"key":"2765_CR65","doi-asserted-by":"publisher","unstructured":"Zhang, L., Gao, J., Xiao, Z., Fan, H.: Animaltrack: A benchmark for multi-animal tracking in the wild. International Journal of Computer Vision 131(2), 496\u2013513 (2022) https:\/\/doi.org\/10.1007\/s11263-022-01711-8","DOI":"10.1007\/s11263-022-01711-8"},{"key":"2765_CR66","unstructured":"Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., Ni, L., Shum, H.-Y.: DINO: DETR with improved denoising anchor boxes for end-to-end object detection. In: ICLR (2023)"},{"issue":"2","key":"2765_CR67","doi-asserted-by":"publisher","first-page":"496","DOI":"10.1007\/s11263-022-01711-8","volume":"131","author":"L Zhang","year":"2022","unstructured":"Zhang, L., Gao, J., Xiao, Z., & Fan, H. (2022). Animaltrack: A benchmark for multi-animal tracking in the wild. International Journal of Computer Vision, 131(2), 496\u2013513. https:\/\/doi.org\/10.1007\/s11263-022-01711-8","journal-title":"International Journal of Computer Vision"},{"key":"2765_CR68","doi-asserted-by":"crossref","unstructured":"Zhang, X., Xiao, T., Ji, G.-P., Wu, X., Fu, K., & Zhao, Q. (2025). Explicit motion handling and interactive prompting for video camouflaged object detection. 
IEEE Transactions on Image Processing,34, 2853\u20132866.","DOI":"10.1109\/TIP.2025.3565879"},{"key":"2765_CR69","doi-asserted-by":"crossref","unstructured":"Zhao, Y., Lv, W., Xu, S., Wei, J., Wang, G., Dang, Q., Liu, Y., Chen, J.: Detrs beat yolos on real-time object detection. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16965\u201316974 (2024)","DOI":"10.1109\/CVPR52733.2024.01605"},{"key":"2765_CR70","unstructured":"Zheng, Z., Xie, Y., Liang, H., Yu, Z., Yeung, S.-K.: CoralVOS: Dataset and benchmark for coral video segmentation. arXiv preprint arXiv:2310.01946 (2023)"},{"key":"2765_CR71","unstructured":"Zheng, Z., Zhang, J., Vu, T.-A., Diao, S., Tim, Y.H.W., Yeung, S.-K.: MarineGPT: Unlocking secrets of ocean to the public. arXiv preprint arXiv:2310.13596 (2023)"},{"key":"2765_CR72","doi-asserted-by":"crossref","unstructured":"Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., Torralba, A.: Scene parsing through ade20k dataset. In: CVPR, pp. 
633\u2013641 (2017)","DOI":"10.1109\/CVPR.2017.544"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-026-02765-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11263-026-02765-8","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-026-02765-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T17:27:48Z","timestamp":1773077268000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11263-026-02765-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,9]]},"references-count":72,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2026,4]]}},"alternative-id":["2765"],"URL":"https:\/\/doi.org\/10.1007\/s11263-026-02765-8","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"print"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,9]]},"assertion":[{"value":"16 October 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 January 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 March 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"176"}}