{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,19]],"date-time":"2026-06-19T02:17:25Z","timestamp":1781835445954,"version":"3.54.5"},"reference-count":74,"publisher":"Association for Computing Machinery (ACM)","issue":"7","license":[{"start":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T00:00:00Z","timestamp":1715817600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["6720110, U21A20471, and U21A20472"],"award-info":[{"award-number":["6720110, U21A20471, and U21A20472"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2024,7,31]]},"abstract":"<jats:p>Nighttime semantic segmentation is an important but challenging research problem for autonomous driving. The major challenges lie in the small objects or regions from the under-\/over-exposed areas or suffer from motion blur caused by the camera deployed on moving vehicles. To resolve this, we propose a novel hard-class-aware module that bridges the main network for full-class segmentation and the hard-class network for segmenting aforementioned hard-class objects. In specific, it exploits the shared focus of hard-class objects from the dual-stream network, enabling the contextual information flow to guide the model to concentrate on the pixels that are hard to classify. In the end, the estimated hard-class segmentation results will be utilized to infer the final results via an adaptive probabilistic fusion refinement scheme. 
Moreover, to overcome over-smoothing and noise caused by extreme exposures, our model is modulated by a carefully crafted pretext task of constructing an exposure-aware semantic gradient map, which guides the model to faithfully perceive the structural and semantic information of hard-class objects while mitigating the negative impact of noises and uneven exposures. In experiments, we demonstrate that our unique network design leads to superior segmentation performance over existing methods, featuring the strong ability of perceiving hard-class objects under adverse conditions.<\/jats:p>","DOI":"10.1145\/3650032","type":"journal-article","created":{"date-parts":[[2024,3,4]],"date-time":"2024-03-04T12:24:09Z","timestamp":1709555049000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Learning Nighttime Semantic Segmentation the Hard Way"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3630-6322","authenticated-orcid":false,"given":"Wenxi","family":"Liu","sequence":"first","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-5697-1746","authenticated-orcid":false,"given":"Jiaxin","family":"Cai","sequence":"additional","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-7890-9220","authenticated-orcid":false,"given":"Qi","family":"Li","sequence":"additional","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-7470-4193","authenticated-orcid":false,"given":"Chenyang","family":"Liao","sequence":"additional","affiliation":[{"name":"College of 
Computer and Data Science, Fuzhou University, Fuzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3483-6100","authenticated-orcid":false,"given":"Jingjing","family":"Cao","sequence":"additional","affiliation":[{"name":"School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3802-4644","authenticated-orcid":false,"given":"Shengfeng","family":"He","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, Singapore"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2112-6214","authenticated-orcid":false,"given":"Yuanlong","family":"Yu","sequence":"additional","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,5,16]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"9157","volume-title":"Proceedings of CVPR","author":"Afifi Mahmoud","year":"2021","unstructured":"Mahmoud Afifi, Konstantinos G. Derpanis, Bjorn Ommer, and Michael S. Brown. 2021. Learning multi-scale photo exposure correction. In Proceedings of CVPR. 9157\u20139167."},{"key":"e_1_3_1_3_2","article-title":"BEiT: BERT pre-training of image transformers","author":"Bao Hangbo","year":"2021","unstructured":"Hangbo Bao, Li Dong, Songhao Piao, and Furu Wei. 2021. BEiT: BERT pre-training of image transformers. arXiv preprint arXiv:2106.08254 (2021).","journal-title":"arXiv preprint arXiv:2106.08254"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"e_1_3_1_5_2","volume-title":"Proceedings of ECCV","author":"Chen Liang-Chieh","year":"2018","unstructured":"Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam. 2018. 
Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of ECCV."},{"key":"e_1_3_1_6_2","first-page":"1597","volume-title":"Proceedings of ICML","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of ICML. 1597\u20131607."},{"key":"e_1_3_1_7_2","first-page":"12154","volume-title":"Proceedings of CVPR","author":"Chen Ting","year":"2019","unstructured":"Ting Chen, Xiaohua Zhai, Marvin Ritter, Mario Lucic, and Neil Houlsby. 2019. Self-supervised GANs via auxiliary rotation loss. In Proceedings of CVPR. 12154\u201312163."},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3460940"},{"key":"e_1_3_1_9_2","article-title":"Complementary coarse-to-fine matching for video object segmentation","author":"Chen Zhen","year":"2023","unstructured":"Zhen Chen, Ming Yang, and Shiliang Zhang. 2023. Complementary coarse-to-fine matching for video object segmentation. ACM Transactions on Multimedia Computing, Communications and Applications 19, 6 (2023), Article 203, 21 pages.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_10_2","first-page":"3819","volume-title":"Proceedings of ITSC","author":"Dai Dengxin","year":"2018","unstructured":"Dengxin Dai and Luc Van Gool. 2018. Dark model adaptation: Semantic image segmentation from daytime to nighttime. In Proceedings of ITSC. IEEE, 3819\u20133824."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.89"},{"key":"e_1_3_1_12_2","first-page":"16938","volume-title":"Proceedings of CVPR","author":"Deng Xueqing","year":"2022","unstructured":"Xueqing Deng, Peng Wang, Xiaochen Lian, and Shawn Newsam. 2022. NightLab: A dual-level architecture with hardness detection for segmentation at night. In Proceedings of CVPR. 
16938\u201316948."},{"key":"e_1_3_1_13_2","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).","journal-title":"arXiv preprint arXiv:1810.04805"},{"key":"e_1_3_1_14_2","first-page":"2393","volume-title":"Proceedings of CVPR","author":"Ding Henghui","year":"2018","unstructured":"Henghui Ding, Xudong Jiang, Bing Shuai, Ai Qun Liu, and Gang Wang. 2018. Context contrasted feature and gated multi-scale aggregation for scene segmentation. In Proceedings of CVPR. 2393\u20132402."},{"key":"e_1_3_1_15_2","first-page":"1422","volume-title":"Proceedings of ICCV","author":"Doersch Carl","year":"2015","unstructured":"Carl Doersch, Abhinav Gupta, and Alexei A. Efros. 2015. Unsupervised visual representation learning by context prediction. In Proceedings of ICCV. 1422\u20131430."},{"key":"e_1_3_1_16_2","article-title":"An image is worth 16x16 words: Transformers for image recognition at scale","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020).","journal-title":"arXiv preprint arXiv:2010.11929"},{"key":"e_1_3_1_17_2","volume-title":"Proceedings of ICLR","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. 
An image is worth 16x16 words: Transformers for image recognition at scale. In Proceedings of ICLR."},{"key":"e_1_3_1_18_2","first-page":"3146","volume-title":"Proceedings of CVPR","author":"Fu Jun","year":"2019","unstructured":"Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, and Hanqing Lu. 2019. Dual attention network for scene segmentation. In Proceedings of CVPR. 3146\u20133154."},{"key":"e_1_3_1_19_2","first-page":"9913","volume-title":"Proceedings of CVPR","author":"Gao Huan","year":"2022","unstructured":"Huan Gao, Jichang Guo, Guoli Wang, and Qian Zhang. 2022. Cross-domain correlation distillation for unsupervised domain adaptation in nighttime semantic segmentation. In Proceedings of CVPR. 9913\u20139923."},{"key":"e_1_3_1_20_2","article-title":"Unsupervised representation learning by predicting image rotations","author":"Gidaris Spyros","year":"2018","unstructured":"Spyros Gidaris, Praveer Singh, and Nikos Komodakis. 2018. Unsupervised representation learning by predicting image rotations. arXiv preprint arXiv:1803.07728 (2018).","journal-title":"arXiv preprint arXiv:1803.07728"},{"key":"e_1_3_1_21_2","first-page":"1780","volume-title":"Proceedings of CVPR","author":"Guo Chunle","year":"2020","unstructured":"Chunle Guo, Chongyi Li, Jichang Guo, Chen Change Loy, Junhui Hou, Sam Kwong, and Runmin Cong. 2020. Zero-reference deep curve estimation for low-light image enhancement. In Proceedings of CVPR. 1780\u20131789."},{"key":"e_1_3_1_22_2","unstructured":"Kai Han An Xiao Enhua Wu Jianyuan Guo Chunjing Xu and Yunhe Wang. 2021. Transformer in transformer. In Proceedings of NeurIPS. 15908\u201315919."},{"key":"e_1_3_1_23_2","first-page":"16000","volume-title":"Proceedings of CVPR","author":"He Kaiming","year":"2022","unstructured":"Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Doll\u00e1r, and Ross Girshick. 2022. Masked autoencoders are scalable vision learners. In Proceedings of CVPR. 
16000\u201316009."},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_25_2","first-page":"4003","volume-title":"Proceedings of CVPR","author":"Hou Qibin","year":"2020","unstructured":"Qibin Hou, Li Zhang, Ming-Ming Cheng, and Jiashi Feng. 2020. Strip pooling: Rethinking spatial pooling for scene parsing. In Proceedings of CVPR. 4003\u20134012."},{"key":"e_1_3_1_26_2","first-page":"9924","volume-title":"Proceedings of CVPR","author":"Hoyer Lukas","year":"2022","unstructured":"Lukas Hoyer, Dengxin Dai, and Luc Van Gool. 2022. DAFormer: Improving network architectures and training strategies for domain-adaptive semantic segmentation. In Proceedings of CVPR. 9924\u20139935."},{"key":"e_1_3_1_27_2","doi-asserted-by":"crossref","unstructured":"Lukas Hoyer Dengxin Dai and Luc Van Gool. 2022. HRDA: Context-aware high-resolution domain-adaptive semantic segmentation. In Proceedings of ECCV. 372\u2013391.","DOI":"10.1007\/978-3-031-20056-4_22"},{"key":"e_1_3_1_28_2","first-page":"603","volume-title":"Proceedings of ICCV","author":"Huang Zilong","year":"2019","unstructured":"Zilong Huang, Xinggang Wang, Lichao Huang, Chang Huang, Yunchao Wei, and Wenyu Liu. 2019. CCNet: Criss-cross attention for semantic segmentation. In Proceedings of ICCV. 603\u2013612."},{"issue":"1","key":"e_1_3_1_29_2","doi-asserted-by":"crossref","first-page":"59","DOI":"10.33899\/csmj.2022.174407","article-title":"Adapted single scale retinex algorithm for nighttime image enhancement","volume":"16","author":"Ismail Mohammad Khalil","year":"2022","unstructured":"Mohammad Khalil Ismail and Zohair Al-Ameen. 2022. Adapted single scale retinex algorithm for nighttime image enhancement. 
AL-Rafidain Journal of Computer Sciences and Mathematics 16, 1 (2022), 59\u201369.","journal-title":"AL-Rafidain Journal of Computer Sciences and Mathematics"},{"key":"e_1_3_1_30_2","first-page":"1920","volume-title":"Proceedings of CVPR","author":"Kolesnikov Alexander","year":"2019","unstructured":"Alexander Kolesnikov, Xiaohua Zhai, and Lucas Beyer. 2019. Revisiting self-supervised visual representation learning. In Proceedings of CVPR. 1920\u20131929."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1038\/scientificamerican1277-108"},{"key":"e_1_3_1_32_2","first-page":"577","volume-title":"Proceedings of ECCV","author":"Larsson Gustav","year":"2016","unstructured":"Gustav Larsson, Michael Maire, and Gregory Shakhnarovich. 2016. Learning representations for automatic colorization. In Proceedings of ECCV. 577\u2013593."},{"key":"e_1_3_1_33_2","first-page":"7252","volume-title":"Proceedings of ICCV","author":"Li Qi","year":"2021","unstructured":"Qi Li, Weixiang Yang, Wenxi Liu, Yuanlong Yu, and Shengfeng He. 2021. From contexts to locality: Ultra-high resolution image segmentation via locality-aware contextual correlation. In Proceedings of ICCV. 7252\u20137261."},{"key":"e_1_3_1_34_2","first-page":"9167","volume-title":"Proceedings of ICCV","author":"Li Xia","year":"2019","unstructured":"Xia Li, Zhisheng Zhong, Jianlong Wu, Yibo Yang, Zhouchen Lin, and Hong Liu. 2019. Expectation-maximization attention networks for semantic segmentation. In Proceedings of ICCV. 9167\u20139176."},{"key":"e_1_3_1_35_2","article-title":"Benchmarking detection transfer learning with vision transformers","author":"Li Yanghao","year":"2021","unstructured":"Yanghao Li, Saining Xie, Xinlei Chen, Piotr Dollar, Kaiming He, and Ross Girshick. 2021. Benchmarking detection transfer learning with vision transformers. 
arXiv preprint arXiv:2111.11429 (2021).","journal-title":"arXiv preprint arXiv:2111.11429"},{"issue":"24","key":"e_1_3_1_36_2","doi-asserted-by":"crossref","first-page":"7143","DOI":"10.1016\/j.ijleo.2014.07.118","article-title":"Multi-scale retinex improvement for nighttime image enhancement","volume":"125","author":"Lin Haoning","year":"2014","unstructured":"Haoning Lin and Zhenwei Shi. 2014. Multi-scale retinex improvement for nighttime image enhancement. Optik 125, 24 (2014), 7143\u20137148.","journal-title":"Optik"},{"key":"e_1_3_1_37_2","unstructured":"Sifei Liu Shalini De Mello Jinwei Gu Guangyu Zhong Ming-Hsuan Yang and Jan Kautz. 2017. Learning affinity via spatial propagation networks. In Proceedings of NIPS."},{"key":"e_1_3_1_38_2","article-title":"Improving nighttime driving-scene segmentation via dual image-adaptive learnable filters","author":"Liu Wenyu","year":"2023","unstructured":"Wenyu Liu, Wentong Li, Jianke Zhu, Miaomiao Cui, Xuansong Xie, and Lei Zhang. 2023. Improving nighttime driving-scene segmentation via dual image-adaptive learnable filters. IEEE Transactions on Circuits and Systems for Video Technology. Published Online, October 2023.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology."},{"issue":"1","key":"e_1_3_1_39_2","first-page":"1","article-title":"Affinity derivation for accurate instance segmentation","volume":"17","author":"Liu Yiding","year":"2021","unstructured":"Yiding Liu, Siyu Yang, Bin Li, Wengang Zhou, Jizheng Xu, Houqiang Li, and Yan Lu. 2021. Affinity derivation for accurate instance segmentation. 
ACM Transactions on Multimedia Computing, Communications, and Applications 17, 1 (2021), 1\u201320.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_40_2","first-page":"10012","volume-title":"Proceedings of ICCV","author":"Liu Ze","year":"2021","unstructured":"Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of ICCV. 10012\u201310022."},{"key":"e_1_3_1_41_2","first-page":"3431","volume-title":"Proceedings of CVPR","author":"Long Jonathan","year":"2015","unstructured":"Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In Proceedings of CVPR. 3431\u20133440."},{"key":"e_1_3_1_42_2","article-title":"Efficient estimation of word representations in vector space","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).","journal-title":"arXiv preprint arXiv:1301.3781"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3558770"},{"key":"e_1_3_1_44_2","first-page":"2536","volume-title":"Proceedings of CVPR","author":"Pathak Deepak","year":"2016","unstructured":"Deepak Pathak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, and Alexei A. Efros. 2016. Context encoders: Feature learning by inpainting. In Proceedings of CVPR. 2536\u20132544."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3376922"},{"key":"e_1_3_1_46_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. 
Preprint."},{"issue":"8","key":"e_1_3_1_47_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.","journal-title":"OpenAI Blog"},{"key":"e_1_3_1_48_2","first-page":"1312","volume-title":"Proceedings of","author":"Romera Eduardo","year":"2019","unstructured":"Eduardo Romera, Luis M. Bergasa, Kailun Yang, Jose M. Alvarez, and Rafael Barea. 2019. Bridging the day and night domain gap for semantic segmentation. In Proceedings of IV. IEEE, 1312\u20131318."},{"key":"e_1_3_1_49_2","first-page":"7374","volume-title":"Proceedings of ICCV","author":"Sakaridis Christos","year":"2019","unstructured":"Christos Sakaridis, Dengxin Dai, and Luc Van Gool. 2019. Guided curriculum model adaptation and uncertainty-aware evaluation for semantic nighttime image segmentation. In Proceedings of ICCV. 7374\u20137383."},{"key":"e_1_3_1_50_2","article-title":"Map-guided curriculum domain adaptation and uncertainty-aware evaluation for semantic nighttime image segmentation","author":"Sakaridis Christos","year":"2020","unstructured":"Christos Sakaridis, Dengxin Dai, and Luc Van Gool. 2020. Map-guided curriculum domain adaptation and uncertainty-aware evaluation for semantic nighttime image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. Published Online, December 18, 2020.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence."},{"key":"e_1_3_1_51_2","first-page":"77","volume-title":"Artificial Intelligence and Machine Learning in Defense Applications","author":"Sun Lei","year":"2019","unstructured":"Lei Sun, Kaiwei Wang, Kailun Yang, and Kaite Xiang. 2019. See clearer at night: Towards robust nighttime semantic segmentation through day-night image conversion. 
In Artificial Intelligence and Machine Learning in Defense Applications. Vol. 11169. SPIE, 77\u201389."},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3122004"},{"key":"e_1_3_1_53_2","first-page":"2517","volume-title":"Proceedings of CVPR","author":"Vu Tuan-Hung","year":"2019","unstructured":"Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, and Patrick P\u00e9rez. 2019. Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. In Proceedings of CVPR. 2517\u20132526."},{"key":"e_1_3_1_54_2","unstructured":"Jingdong Wang Ke Sun Tianheng Cheng Borui Jiang Chaorui Deng Yang Zhao Dong Liu Yadong Mu Mingkui Tan Xinggang Wang Wenyu Liu and Bin Xiao. 2019. Deep high-resolution representation learning for visual recognition. In Proceedings of CVPR."},{"key":"e_1_3_1_55_2","first-page":"568","volume-title":"Proceedings of ICCV","author":"Wang Wenhai","year":"2021","unstructured":"Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, and Ling Shao. 2021. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. In Proceedings of ICCV. 568\u2013578."},{"key":"e_1_3_1_56_2","first-page":"7794","volume-title":"Proceedings of CVPR","author":"Wang Xiaolong","year":"2018","unstructured":"Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. 2018. Non-local neural networks. In Proceedings of CVPR. 7794\u20137803."},{"key":"e_1_3_1_57_2","doi-asserted-by":"crossref","unstructured":"Sanghyun Woo Jongchan Park Joon-Young Lee and In So Kweon. 2018. CBAM: Convolutional block attention module. In Proceedings of ECCV. 3\u201319.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"e_1_3_1_58_2","first-page":"15769","volume-title":"Proceedings of CVPR","author":"Wu Xinyi","year":"2021","unstructured":"Xinyi Wu, Zhenyao Wu, Hao Guo, Lili Ju, and Song Wang. 2021. 
DANNet: A one-stage domain adaptation network for unsupervised nighttime semantic segmentation. In Proceedings of CVPR. 15769\u201315778."},{"key":"e_1_3_1_59_2","article-title":"A one-stage domain adaptation network with image alignment for unsupervised nighttime semantic segmentation","author":"Wu Xinyi","year":"2021","unstructured":"Xinyi Wu, Zhenyao Wu, Lili Ju, and Song Wang. 2021. A one-stage domain adaptation network with image alignment for unsupervised nighttime semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. Published Online, December 28, 2021.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence."},{"issue":"1","key":"e_1_3_1_60_2","first-page":"Article 15, 19","article-title":"A weakly supervised semantic segmentation network by aggregating seed cues: The multi-object proposal generation perspective","volume":"17","author":"Xiao Junsheng","year":"2021","unstructured":"Junsheng Xiao, Huahu Xu, Honghao Gao, Minjie Bian, and Yang Li. 2021. A weakly supervised semantic segmentation network by aggregating seed cues: The multi-object proposal generation perspective. ACM Transactions on Multimedia Computing, Communications, and Applications 17, 1s (2021), Article 15, 19 pages.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_61_2","first-page":"418","volume-title":"Proceedings of ECCV","author":"Xiao Tete","year":"2018","unstructured":"Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, and Jian Sun. 2018. Unified perceptual parsing for scene understanding. In Proceedings of ECCV. 418\u2013434."},{"key":"e_1_3_1_62_2","unstructured":"Enze Xie Wenhai Wang Zhiding Yu Anima Anandkumar Jose M. Alvarez and Ping Luo. 2021. SegFormer: Simple and efficient design for semantic segmentation with transformers. In Proceedings of NeurIPS. 
12077\u201312090."},{"key":"e_1_3_1_63_2","article-title":"Boosting night-time scene parsing with learnable frequency","author":"Xie Zhifeng","year":"2023","unstructured":"Zhifeng Xie, Sen Wang, Ke Xu, Zhizhong Zhang, Xin Tan, Yuan Xie, and Lizhuang Ma. 2023. Boosting night-time scene parsing with learnable frequency. IEEE Transactions on Image Processing. Published Online, April 2023.","journal-title":"IEEE Transactions on Image Processing."},{"key":"e_1_3_1_64_2","first-page":"2962","volume-title":"Proceedings of ICCVW","author":"Xu Qi","year":"2021","unstructured":"Qi Xu, Yinan Ma, Jing Wu, Chengnian Long, and Xiaolin Huang. 2021. CDAda: A curriculum domain adaptation for nighttime semantic segmentation. In Proceedings of ICCVW. 2962\u20132971."},{"key":"e_1_3_1_65_2","first-page":"191","volume-title":"Proceedings of ECCV","author":"Yin Minghao","year":"2020","unstructured":"Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, and Han Hu. 2020. Disentangled non-local neural networks. In Proceedings of ECCV. 191\u2013207."},{"key":"e_1_3_1_66_2","first-page":"2636","volume-title":"Proceedings of CVPR","author":"Yu Fisher","year":"2020","unstructured":"Fisher Yu, Haofeng Chen, Xin Wang, Wenqi Xian, Yingying Chen, Fangchen Liu, Vashisht Madhavan, and Trevor Darrell. 2020. BDD100k: A diverse driving dataset for heterogeneous multitask learning. In Proceedings of CVPR. 2636\u20132645."},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.1145\/3321512"},{"key":"e_1_3_1_68_2","article-title":"OCNet: Object context network for scene parsing","author":"Yuan Yuhui","year":"2018","unstructured":"Yuhui Yuan, Lang Huang, Jianyuan Guo, Chao Zhang, Xilin Chen, and Jingdong Wang. 2018. OCNet: Object context network for scene parsing. 
arXiv preprint arXiv:1809.00916 (2018).","journal-title":"arXiv preprint arXiv:1809.00916"},{"key":"e_1_3_1_69_2","first-page":"7151","volume-title":"Proceedings of CVPR","author":"Zhang Hang","year":"2018","unstructured":"Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, and Amit Agrawal. 2018. Context encoding for semantic segmentation. In Proceedings of CVPR. 7151\u20137160."},{"key":"e_1_3_1_70_2","first-page":"548","volume-title":"Proceedings of CVPR","author":"Zhang Hang","year":"2019","unstructured":"Hang Zhang, Han Zhang, Chenguang Wang, and Junyuan Xie. 2019. Co-occurrent features in semantic segmentation. In Proceedings of CVPR. 548\u2013557."},{"key":"e_1_3_1_71_2","first-page":"649","volume-title":"Proceedings of ECCV","author":"Zhang Richard","year":"2016","unstructured":"Richard Zhang, Phillip Isola, and Alexei A. Efros. 2016. Colorful image colorization. In Proceedings of ECCV. 649\u2013666."},{"key":"e_1_3_1_72_2","first-page":"2881","volume-title":"Proceedings of CVPR","author":"Zhao Hengshuang","year":"2017","unstructured":"Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. 2017. Pyramid scene parsing network. In Proceedings of CVPR. 2881\u20132890."},{"key":"e_1_3_1_73_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01240-3_17"},{"key":"e_1_3_1_74_2","first-page":"6881","volume-title":"Proceedings of CVPR","author":"Zheng Sixiao","year":"2021","unstructured":"Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H. S. Torr, and Li Zhang. 2021. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In Proceedings of CVPR. 
6881\u20136890."},{"issue":"2","key":"e_1_3_1_75_2","first-page":"492","article-title":"To see in the dark: N2DGAN for background modeling in nighttime scene","volume":"31","author":"Zhu Zhenfeng","year":"2020","unstructured":"Zhenfeng Zhu, Yingying Meng, Deqiang Kong, Xingxing Zhang, Yandong Guo, and Yao Zhao. 2020. To see in the dark: N2DGAN for background modeling in nighttime scene. IEEE Transactions on Circuits and Systems for Video Technology 31, 2 (2020), 492\u2013502.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3650032","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3650032","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:43Z","timestamp":1750291423000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3650032"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,16]]},"references-count":74,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2024,7,31]]}},"alternative-id":["10.1145\/3650032"],"URL":"https:\/\/doi.org\/10.1145\/3650032","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,16]]},"assertion":[{"value":"2023-07-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-02-16","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2024-05-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}