{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T19:37:54Z","timestamp":1773517074538,"version":"3.50.1"},"reference-count":83,"publisher":"Association for Computing Machinery (ACM)","issue":"4","funder":[{"name":"RIE2025 Career Development Fund","award":["Award C233312009"],"award-info":[{"award-number":["Award C233312009"]}]},{"DOI":"10.13039\/501100001348","name":"Agency for Science, Technology and Research","doi-asserted-by":"crossref","award":["A*STAR"],"award-info":[{"award-number":["A*STAR"]}],"id":[{"id":"10.13039\/501100001348","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2026,4,30]]},"abstract":"<jats:p>\n                    Source-free domain adaptive semantic segmentation aims at adapting a model trained on the source domain to the target domain without requiring access to the source data. Self-training has emerged as a leading approach to address this challenging problem. However, without robust denoising mechanisms to reduce the noise in pseudo labels, it still easily falls into biased estimates. Most existing methods address this issue by introducing novel architectures, but often at the cost of increased model complexity or reliance on additional input modalities. Different from previous studies, this article introduces\n                    <jats:italic toggle=\"yes\">UniSFDA<\/jats:italic>\n                    , a unified multi-stage self-training framework that integrates cross-model transfer learning, uncertainty-aware pseudo label fusion, and intra-domain style augmentation, thereby enhancing both segmentation accuracy and test-time efficiency. 
Our proposed framework offers exceptional flexibility, with each component being independent and ready to be integrated into any existing self-training framework. Additionally, we investigate the performance of various representative segmentation models, including DeepLabv2, SegFormer, DFormer, and ViT-Adapter, within our framework. It is worth noting that\n                    <jats:italic toggle=\"yes\">UniSFDA<\/jats:italic>\n                    is model-agnostic, allowing both source and target networks to be instantiated with arbitrary segmentation architectures, and thus readily benefiting from future advances in segmentation models. Experiments on the GTA5\n                    <jats:inline-formula content-type=\"math\/tex\">\n                      <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(\\rightarrow\\)<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    Cityscapes and SYNTHIA\n                    <jats:inline-formula content-type=\"math\/tex\">\n                      <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(\\rightarrow\\)<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    Cityscapes benchmarks demonstrate the effectiveness of our framework. 
With DeepLabv2 (SegFormer) as the source model,\n                    <jats:italic toggle=\"yes\">UniSFDA<\/jats:italic>\n                    establishes new state-of-the-art performance, achieving mIoU scores of 61.8% (65.4%) and 57.9% (59.6%) on the two benchmarks, respectively.\n                  <\/jats:p>","DOI":"10.1145\/3795886","type":"journal-article","created":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T11:39:56Z","timestamp":1770464396000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Improving Test-Time Efficiency in Source-Free Semantic Segmentation via Multi-Stage Self-Training"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6525-6133","authenticated-orcid":false,"given":"Yifang","family":"Yin","sequence":"first","affiliation":[{"name":"Institute for Infocomm Research (IR), A*STAR, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8614-7366","authenticated-orcid":false,"given":"Jinming","family":"Cao","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7981-9873","authenticated-orcid":false,"given":"Zhenguang","family":"Liu","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4224-1449","authenticated-orcid":false,"given":"Guanfeng","family":"Wang","sequence":"additional","affiliation":[{"name":"Grabtaxi Holdings Limited, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6598-2904","authenticated-orcid":false,"given":"Shili","family":"Xiang","sequence":"additional","affiliation":[{"name":"Institute for Infocomm Research 
(IR), A*STAR, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7410-2590","authenticated-orcid":false,"given":"Roger","family":"Zimmermann","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,3,9]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Ibrahim Batuhan Akkaya and Ugur Halici. 2022. Self-training via metric learning for source-free domain adaptation of semantic segmentation. arXiv:2212.04227. Retrieved from https:\/\/arxiv.org\/abs\/2212.04227"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01513"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-59710-8_48"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00300"},{"key":"e_1_3_2_6_2","unstructured":"Yihong Cao Hui Zhang Xiao Lu Zheng Xiao Kailun Yang and Yaonan Wang. 2023. Towards source-free domain adaptive semantic segmentation via importance-aware and prototype-contrast learning. arXiv:2306.01598. Retrieved from https:\/\/arxiv.org\/abs\/2306.01598"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2024.104091"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/3664647.3681041"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581783.3611708"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3460940"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.220"},{"key":"e_1_3_2_13_2","volume-title":"ICLR","author":"Chen Zhe","year":"2023","unstructured":"Zhe Chen, Yuchen Duan, Wenhai Wang, Junjun He, Tong Lu, Jifeng Dai, and Yu Qiao. 2023. Vision transformer adapter for dense predictions. 
In ICLR."},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.350"},{"key":"e_1_3_2_15_2","volume-title":"ICLR","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. In ICLR."},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00107"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.01523"},{"key":"e_1_3_2_18_2","first-page":"1050","volume-title":"International Conference on Machine Learning","author":"Gal Yarin","year":"2016","unstructured":"Yarin Gal and Zoubin Ghahramani. 2016. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In International Conference on Machine Learning. PMLR, 1050\u20131059."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3422622"},{"key":"e_1_3_2_20_2","unstructured":"Adam Goodge Wee Siong Ng Bryan Hooi and See Kiong Ng. 2025. Spatio-temporal foundation models: Vision challenges and opportunities. arXiv:2501.09045. Retrieved from https:\/\/arxiv.org\/abs\/2501.09045"},{"key":"e_1_3_2_21_2","first-page":"1321","volume-title":"ICML","author":"Guo Chuan","year":"2017","unstructured":"Chuan Guo, Geoff Pleiss, Yu Sun, and Kilian Q. Weinberger. 2017. On calibration of modern neural networks. In ICML. PMLR, 1321\u20131330."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_23_2","first-page":"1989","volume-title":"ICML","author":"Hoffman Judy","year":"2018","unstructured":"Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alexei Efros, and Trevor Darrell. 2018. CyCADA: Cycle-consistent adversarial domain adaptation. 
In ICML, 1989\u20131998."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00969"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-20056-4_22"},{"key":"e_1_3_2_26_2","volume-title":"IEEE TPAMI","author":"Hoyer Lukas","year":"2023","unstructured":"Lukas Hoyer, Dengxin Dai, and Luc Van Gool. 2023. Domain adaptive and generalizable network architectures and training strategies for semantic image segmentation. In IEEE TPAMI."},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3635153"},{"key":"e_1_3_2_28_2","first-page":"3635","article-title":"Model adaptation: Historical contrastive learning for unsupervised domain adaptation without source data","volume":"34","author":"Huang Jiaxing","year":"2021","unstructured":"Jiaxing Huang, Dayan Guan, Aoran Xiao, and Shijian Lu. 2021. Model adaptation: Historical contrastive learning for unsupervised domain adaptation without source data. In Advances in Neural Information Processing System, Vol. 34, 3635\u20133649.","journal-title":"Advances in Neural Information Processing System"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00299"},{"key":"e_1_3_2_30_2","unstructured":"Alexander B. Jung Kentaro Wada Jon Crall Satoshi Tanaka Jake Graving Christoph Reinders Sarthak Yadav Joy Banerjee G\u00e1bor Vecsei Adam Kraft et al. 2020. imgaug. Retrieved February 1 2020 from https:\/\/github.com\/aleju\/imgaug"},{"key":"e_1_3_2_31_2","unstructured":"Patrick Kage Jay C. Rothenberger Pavlos Andreadis and Dimitrios I. Diochnos. 2024. A review of pseudo-labeling for computer vision. arXiv:2408.07221. 
Retrieved from https:\/\/arxiv.org\/abs\/2408.07221"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00069"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00696"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00970"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58568-6_26"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00710"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3387926"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3650032"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00127"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01167"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2023.3295929"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-023-01863-1"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00261"},{"key":"e_1_3_2_44_2","first-page":"6690","volume-title":"Advances in Neural Information Processing System","volume":"36","author":"Ma Xinhong","year":"2023","unstructured":"Xinhong Ma, Yiming Wang, Hao Liu, Tianyu Guo, and Yunhe Wang. 2023. When visual prompt tuning meets source-free domain adaptive semantic segmentation. In Advances in Neural Information Processing System, Vol. 36, 6690\u20136702."},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58574-7_25"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01225"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV48630.2021.00141"},{"key":"e_1_3_2_48_2","first-page":"585","volume-title":"BMVC","author":"Pan An-Tao","year":"2022","unstructured":"An-Tao Pan, Yawei Luo, Yi Yang, and Jun Xiao. 2022. 
DUDA: Online-offline dual domain adaption for semantic segmentation. In BMVC, 585."},{"key":"e_1_3_2_49_2","unstructured":"Viraj Prabhu Shivam Khare Deeksha Kartik and Judy Hoffman. 2021. AUGCO: Augmentation consistency-guided self-training for source-free domain adaptive semantic segmentation. arXiv:2107.10140. Retrieved from https:\/\/arxiv.org\/abs\/2107.10140"},{"key":"e_1_3_2_50_2","first-page":"9613","volume-title":"CVPR","author":"Teja S. Prabhu","year":"2021","unstructured":"S. Prabhu Teja and Fran\u00e7ois Fleuret. 2021. Uncertainty reduction for model adaptation in semantic segmentation. In CVPR, 9613\u20139623."},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3649900"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46475-6_7"},{"key":"e_1_3_2_53_2","volume-title":"ICLR","author":"Rizve Mamshad Nayeem","year":"2021","unstructured":"Mamshad Nayeem Rizve, Kevin Duarte, Yogesh S. Rawat, and Mubarak Shah. 2021. Defense of pseudo-labeling: An uncertainty-aware pseudo-label selection framework for semi-supervised learning. In ICLR. Retrieved from https:\/\/openreview.net\/forum?id=-ODN6SbiUU"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.352"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01261-8_42"},{"key":"e_1_3_2_56_2","first-page":"532","volume-title":"ECCV","author":"Shin Inkyu","year":"2020","unstructured":"Inkyu Shin, Sanghyun Woo, Fei Pan, and In So Kweon. 2020. Two-phase PL densification for self-training based domain adaptation. In ECCV, 532\u2013548."},{"key":"e_1_3_2_57_2","first-page":"1195","article-title":"Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results","volume":"30","author":"Tarvainen Antti","year":"2017","unstructured":"Antti Tarvainen and Harri Valpola. 2017. 
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Advances in Neural Information Processing System, Vol. 30, 1195\u20131204.","journal-title":"Advances in Neural Information Processing System"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00780"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00262"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00746"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00840"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/3639053"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3724398"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00122"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/3715136"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.02704"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.1145\/3725735"},{"key":"e_1_3_2_68_2","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie Enze","year":"2021","unstructured":"Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo. 2021. SegFormer: Simple and efficient design for semantic segmentation with transformers. In Advances in Neural Information Processing System, Vol. 
34, 12077\u201312090.","journal-title":"Advances in Neural Information Processing System"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICME52920.2022.9859581"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v37i9.26280"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00414"},{"key":"e_1_3_2_72_2","first-page":"2233","article-title":"Source data-free unsupervised domain adaptation for semantic segmentation","author":"Ye Mucong","year":"2021","unstructured":"Mucong Ye, Jing Zhang, Jinpeng Ouyang, and Ding Yuan. 2021. Source data-free unsupervised domain adaptation for semantic segmentation. In ACM International Conference on Multimedia, 2233\u20132242.","journal-title":"ACM International Conference on Multimedia"},{"key":"e_1_3_2_73_2","volume-title":"ICLR","author":"Yin Bowen","year":"2024","unstructured":"Bowen Yin, Xuying Zhang, Zhong-Yu Li, Li Liu, Ming-Ming Cheng, and Qibin Hou. 2024. DFormer: Rethinking RGBD representation learning for semantic segmentation. In ICLR."},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.01991"},{"key":"e_1_3_2_75_2","first-page":"3293","article-title":"Domain adaptive semantic segmentation without source data","author":"You Fuming","year":"2021","unstructured":"Fuming You, Jingjing Li, Lei Zhu, Zhi Chen, and Zi Huang. 2021. Domain adaptive semantic segmentation without source data. In ACM International Conference on Multimedia, 3293\u20133302.","journal-title":"ACM International Conference on Multimedia"},{"key":"e_1_3_2_76_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00612"},{"key":"e_1_3_2_77_2","doi-asserted-by":"publisher","DOI":"10.1145\/3664647.3680567"},{"key":"e_1_3_2_78_2","doi-asserted-by":"crossref","unstructured":"Kai Zhang Yifan Sun Rui Wang Haichang Li and Xiaohui Hu. 2021. Multiple fusion adaptation: A strong framework for unsupervised semantic segmentation adaptation. 
arXiv:2112.00295. Retrieved from https:\/\/arxiv.org\/abs\/2112.00295","DOI":"10.5244\/C.35.153"},{"key":"e_1_3_2_79_2","first-page":"12414","volume-title":"CVPR","author":"Zhang Pan","year":"2021","unstructured":"Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Yong Wang, and Fang Wen. 2021. Prototypical PL denoising and target structure learning for domain adaptive semantic segmentation. In CVPR, 12414\u201312424."},{"key":"e_1_3_2_80_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.02210"},{"key":"e_1_3_2_81_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01129"},{"key":"e_1_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19815-1_31"},{"key":"e_1_3_2_83_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-023-01911-w"},{"key":"e_1_3_2_84_2","first-page":"338","article-title":"Adversarial style augmentation for domain generalized urban-scene segmentation","volume":"35","author":"Zhong Zhun","year":"2022","unstructured":"Zhun Zhong, Yuyang Zhao, Gim Hee Lee, and Nicu Sebe. 2022. Adversarial style augmentation for domain generalized urban-scene segmentation. In Advances in Neural Information Processing System, Vol. 
35, 338\u2013350.","journal-title":"Advances in Neural Information Processing System"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3795886","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T13:25:06Z","timestamp":1773494706000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3795886"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,9]]},"references-count":83,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2026,4,30]]}},"alternative-id":["10.1145\/3795886"],"URL":"https:\/\/doi.org\/10.1145\/3795886","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,9]]},"assertion":[{"value":"2025-09-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-01-30","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-03-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}