{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T14:37:40Z","timestamp":1775745460193,"version":"3.50.1"},"reference-count":97,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2025,4,8]],"date-time":"2025-04-08T00:00:00Z","timestamp":1744070400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["623B2073, 62301310, 62101326, 62225112, and 62301316"],"award-info":[{"award-number":["623B2073, 62301310, 62101326, 62225112, and 62301316"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Sichuan Science and Technology Program","award":["2024NSFSC1426"],"award-info":[{"award-number":["2024NSFSC1426"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2025,4,30]]},"abstract":"<jats:p>The importance of visual quality in point clouds has been significantly underlined due to the rapid rise in 3D vision applications which aim to deliver affordable and superior user experiences. Reviewing the evolution of point cloud quality assessment (PCQA), it\u2019s observed that visual quality evaluation typically employs single-modal data, either sourced from 2D projections or the 3D point clouds. The 2D projections possess abundant texture and semantic information while they are heavily reliant on viewpoints. In contrast, 3D point clouds are more reactive to geometric distortions and viewpoint-invariant. Consequently, to maximize the benefits of both point cloud and image modalities, we present an advanced no-reference Multi-Modal Point Cloud Quality Assessment (MM-PCQA+) metric. Specifically, we divide the point clouds into sub-models to reflect local geometric distortions such as point shifting and down-sampling. Afterwards, we render the point clouds using a cube-like projection setup and sample the projections of interest using a point-visible-ratio for image feature extraction. In order to fulfill these objectives, the sub-models and projected images are encoded using point-based and image-based neural networks. Lastly, we implement symmetric cross-modal attention to amalgamate multi-modal quality-aware features. Experimental results demonstrate that our metric surpasses all state-of-the-art methods and significantly advances beyond previous no-reference PCQA methods.<\/jats:p>","DOI":"10.1145\/3715134","type":"journal-article","created":{"date-parts":[[2025,1,24]],"date-time":"2025-01-24T15:51:50Z","timestamp":1737733910000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["MM-PCQA+: Advancing Multi-Modal Learning for Point Cloud Quality Assessment"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7247-7938","authenticated-orcid":false,"given":"Zicheng","family":"Zhang","sequence":"first","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-3915-8257","authenticated-orcid":false,"given":"Yingjie","family":"Zhou","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-0634-1710","authenticated-orcid":false,"given":"Chunyi","family":"Li","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8162-1949","authenticated-orcid":false,"given":"Wei","family":"Sun","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5693-0416","authenticated-orcid":false,"given":"Xiongkuo","family":"Min","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6377-4730","authenticated-orcid":false,"given":"Xiaohong","family":"Liu","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8165-9322","authenticated-orcid":false,"given":"Guangtao","family":"Zhai","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]}],"member":"320","published-online":{"date-parts":[[2025,4,8]]},"reference":[{"key":"e_1_3_1_2_2","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"Afham Mohamed","year":"2022","unstructured":"Mohamed Afham, Isuru Dissanayake, Dinithi Dissanayake, Amaya Dharmasiri, Kanchana Thilakarathna, and Ranga Rodrigo. 2022. Crosspoint: Self-supervised cross-modal contrastive learning for 3d point cloud understanding. In IEEE\/CVF International Conference on Computer Vision."},{"key":"e_1_3_1_3_2","first-page":"1","volume-title":"International Conference on Multimedia and Expo Workshop","author":"Alexiou Evangelos","year":"2020","unstructured":"Evangelos Alexiou and Touradj Ebrahimi. 2020. Towards a point cloud structural similarity metric. In International Conference on Multimedia and Expo Workshop, 1\u20136."},{"key":"e_1_3_1_4_2","unstructured":"Jochen Antkowiak T. D. F Jamal Baina France Vittorio Baroncini Noel Chateau France FranceTelecom Antonio Claudio Fran\u00e7a Pessoa FUB Stephanie Colonnese Italy Laura Contin Jorge Caviedes and France Philips. 2000. Final report from the video quality experts group on the validation of objective models of video quality assessment march 2000."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58589-1_5"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.displa.2023.102540"},{"key":"e_1_3_1_7_2","unstructured":"Zijian Chen Wei Sun Yuan Tian Jun Jia Zicheng Zhang Jiarui Wang Ru Huang Xiongkuo Min Guangtao Zhai and Wenjun Zhang. 2024. GAIA: Rethinking action quality assessment for AI-generated videos. arXiv:2406.06087. Retrieved from https:\/\/arxiv.org\/abs\/2406.06087"},{"key":"e_1_3_1_8_2","unstructured":"Zijian Chen Wei Sun Haoning Wu Zicheng Zhang Jun Jia Xiongkuo Min Guangtao Zhai and Wenjun Zhang. 2023. Exploring the naturalness of AI-generated images. arXiv:2312.05476. Retrieved from https:\/\/arxiv.org\/abs\/2312.05476"},{"key":"e_1_3_1_9_2","volume-title":"35th AAAI Conference on Artificial Intelligence (AAAI \u201921)","author":"Cheng Mingmei","year":"2021","unstructured":"Mingmei Cheng, Le Hui, Jin Xie, and Jian Yang. 2021. SSPC-Net: Semi-supervised semantic 3D point cloud segmentation network. In 35th AAAI Conference on Artificial Intelligence (AAAI \u201921)."},{"key":"e_1_3_1_10_2","first-page":"3884","volume-title":"ACM International Conference on Multimedia","author":"Cheng Ying","year":"2020","unstructured":"Ying Cheng, Ruize Wang, Zhihao Pan, Rui Feng, and Yuejie Zhang. 2020. Look, listen, and attend: Co-attention network for self-supervised audio-visual representation learning. In ACM International Conference on Multimedia, 3884\u20133892."},{"key":"e_1_3_1_11_2","first-page":"1","volume-title":"International Conference on Multimedia and Expo Workshop","author":"Chetouani Aladine","year":"2021","unstructured":"Aladine Chetouani, Maurice Quach, Giuseppe Valenzise, and Fr\u00e9d\u00e9ric Dufaux. 2021. Deep learning-based quality assessment of 3d point clouds without reference. In International Conference on Multimedia and Expo Workshop, 1\u20136."},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2020.3023541"},{"key":"e_1_3_1_13_2","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"Deng Jia","year":"2009","unstructured":"Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In IEEE\/CVF International Conference on Computer Vision."},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2022\/126"},{"key":"e_1_3_1_15_2","first-page":"1","volume-title":"2022 IEEE 24th International Workshop on Multimedia Signal Processing (IEEE MMSP)","author":"Fan Yu","year":"2022","unstructured":"Yu Fan, Zicheng Zhang, Wei Sun, Xiongkuo Min, Ning Liu, Quan Zhou, Jun He, Qiyuan Wang, and Guangtao Zhai. 2022. A no-reference quality assessment metric for point cloud based on captured video sequences. In 2022 IEEE 24th International Workshop on Multimedia Signal Processing (IEEE MMSP). IEEE, 1\u20135."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1017\/ATSIP.2020.12"},{"key":"e_1_3_1_17_2","doi-asserted-by":"crossref","first-page":"339","DOI":"10.5194\/isprs-archives-XLII-2-W3-339-2017","article-title":"A review of point clouds segmentation and classification algorithms","volume":"42","author":"Grilli Eleonora","year":"2017","unstructured":"Eleonora Grilli, Fabio Menna, and Fabio Remondino. 2017. A review of point clouds segmentation and classification algorithms. In The International Archives of Photogrammetry Remote Sensing and Spatial Information Sciences, Vol. 42, 339.","journal-title":"The International Archives of Photogrammetry Remote Sensing and Spatial Information Sciences"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2017.2649101"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2019.2936738"},{"key":"e_1_3_1_20_2","first-page":"770","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In IEEE\/CVF International Conference on Computer Vision, 770\u2013778."},{"key":"e_1_3_1_21_2","first-page":"24\u2013es","article-title":"Direct visibility of point sets","author":"Katz Sagi","year":"2007","unstructured":"Sagi Katz, Ayellet Tal, and Ronen Basri. 2007. Direct visibility of point sets. In ACM SIGGRAPH 2007 Papers (SIGGRAPH \u201907), 24\u2013es.","journal-title":"ACM SIGGRAPH 2007 Papers (SIGGRAPH \u201907)"},{"key":"e_1_3_1_22_2","volume-title":"International Conference on Learning Representations","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In International Conference on Learning Representations."},{"key":"e_1_3_1_23_2","unstructured":"Tengchuan Kou Xiaohong Liu Zicheng Zhang Chunyi Li Haoning Wu Xiongkuo Min Guangtao Zhai and Ning Liu. 2024. Subjective-aligned dateset and metric for text-to-video quality assessment. arXiv:2403.11956. Retrieved from https:\/\/arxiv.org\/abs\/2403.11956"},{"key":"e_1_3_1_24_2","first-page":"1","volume-title":"IEEE\/RSJ International Conference on Intelligent Robots and Systems","author":"Ku Jason","year":"2018","unstructured":"Jason Ku, Melissa Mozifian, Jungwook Lee, Ali Harakeh, and Steven L. Waslander. 2018. Joint 3d proposal generation and object detection from view aggregation. In IEEE\/RSJ International Conference on Intelligent Robots and Systems, 1\u20138."},{"key":"e_1_3_1_25_2","unstructured":"Chunyi Li Tengchuan Kou Yixuan Gao Yuqin Cao Wei Sun Zicheng Zhang Yingjie Zhou Zhichao Zhang Weixia Zhang Haoning Wu et al. 2024. Aigiqa-20k: A large database for ai-generated image quality assessment. arXiv:2404.03407. Retrieved from https:\/\/arxiv.org\/abs\/2404.03407"},{"key":"e_1_3_1_26_2","unstructured":"Chunyi Li Haoning Wu Zicheng Zhang Hongkun Hao Kaiwei Zhang Lei Bai Xiaohong Liu Xiongkuo Min Weisi Lin and Guangtao Zhai. 2024. Q-Refine: A perceptual quality refiner for AI-generated image. arXiv:2401.01117. Retrieved from https:\/\/arxiv.org\/abs\/2401.01117"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2022.3167151"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3096060"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2020.3023294"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2021.3100282"},{"issue":"1","key":"e_1_3_1_31_2","first-page":"107","article-title":"A paraboost method to image quality assessment","volume":"28","author":"Liu Tsung-Jung","year":"2015","unstructured":"Tsung-Jung Liu, Kuan-Hsien Liu, Joe Yuchieh Lin, Weisi Lin, and C.-C. Jay Kuo. 2015. A paraboost method to image quality assessment. IEEE Transactions on Neural Networks and Learning Systems 28, 1 (2015), 107\u2013121.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_3_1_32_2","volume-title":"International Joint Conference on Artificial Intelligence","author":"Liu Weiquan","year":"2022","unstructured":"Weiquan Liu, Hanyun Guo, Weini Zhang, Yu Zang, Cheng Wang, and Jonathan Li. 2022a. TopoSeg: Topology-aware segmentation for point clouds. In International Joint Conference on Artificial Intelligence (2022)."},{"key":"e_1_3_1_33_2","unstructured":"Xiaohong Liu Xiongkuo Min Guangtao Zhai Chunyi Li Tengchuan Kou Wei Sun Haoning Wu Yixuan Gao Yuqin Cao Zicheng Zhang et al. 2024. NTIRE 2024 Quality Assessment of AI-Generated Content Challenge. arXiv:2404.16687. Retrieved from https:\/\/arxiv.org\/abs\/2404.16687"},{"issue":"2","key":"e_1_3_1_34_2","first-page":"1","article-title":"Point cloud quality assessment: Dataset construction and learning-based no-reference metric","volume":"19","author":"Liu Yipeng","year":"2022","unstructured":"Yipeng Liu, Qi Yang, Yiling Xu, and Le Yang. 2022. Point cloud quality assessment: Dataset construction and learning-based no-reference metric. ACM Transactions on Multimedia Computing, Communications, and Applications 19, 2s (2022), 1\u201326.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2016.2543039"},{"key":"e_1_3_1_37_2","article-title":"Evaluation criteria for point cloud compression","volume":"16332","author":"Mekuria R.","year":"2016","unstructured":"R. Mekuria, Z. Li, C. Tulvan, and P. Chou. 2016. Evaluation criteria for point cloud compression. ISO\/IEC MPEG 16332.","journal-title":"ISO\/IEC MPEG"},{"key":"e_1_3_1_38_2","first-page":"1","volume-title":"International Workshop on Quality of Multimedia","author":"Meynet Gabriel","year":"2020","unstructured":"Gabriel Meynet, Yana Nehm\u00e9, Julie Digne, and Guillaume Lavou\u00e9. 2020. PCQM: A full-reference quality metric for colored 3D point clouds. In International Workshop on Quality of Multimedia, 1\u20136."},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2012.2214050"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2012.2227726"},{"key":"e_1_3_1_41_2","first-page":"117","volume-title":"IEEE\/ACM International Symposium on Mixed and Augmented Reality","author":"Park Youngmin","year":"2008","unstructured":"Youngmin Park, Vincent Lepetit, and Woontack Woo. 2008. Multiple 3d object tracking for augmented reality. In IEEE\/ACM International Symposium on Mixed and Augmented Reality, 117\u2013120."},{"key":"e_1_3_1_42_2","first-page":"918","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"Qi Charles R.","year":"2018","unstructured":"Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, and Leonidas J. Guibas. 2018. Frustum pointnets for 3d object detection from RGB-D data. In IEEE\/CVF International Conference on Computer Vision, 918\u2013927."},{"key":"e_1_3_1_43_2","first-page":"5105","volume-title":"31st International Conference on Neural Information Processing Systems (NIPS)","author":"Qi Charles Ruizhongtai","year":"2017","unstructured":"Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J. Guibas. 2017. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In 31st International Conference on Neural Information Processing Systems (NIPS), 5105\u20135114."},{"key":"e_1_3_1_44_2","first-page":"8748","volume-title":"International Conference on Learning Representations","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In International Conference on Learning Representations, 8748\u20138763."},{"key":"e_1_3_1_45_2","first-page":"4510","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"Sandler Mark","year":"2018","unstructured":"Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. 2018. Mobilenetv2: Inverted residuals and linear bottlenecks. In IEEE\/CVF International Conference on Computer Vision, 4510\u20134520."},{"key":"e_1_3_1_46_2","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556. Retrieved from https:\/\/arxiv.org\/abs\/1409.1556"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3548329"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2017.8296925"},{"key":"e_1_3_1_49_2","first-page":"174","article-title":"A novel methodology for quality assessment of voxelized point clouds","volume":"10752","author":"Torlig Eric M.","year":"2018","unstructured":"Eric M. Torlig, Evangelos Alexiou, Tiago A. Fonseca, Ricardo L. de Queiroz, and Touradj Ebrahimi. 2018. A novel methodology for quality assessment of voxelized point clouds. In Applications of Digital Image Processing XLI, Vol. 10752, 174\u2013190.","journal-title":"Applications of Digital Image Processing XLI"},{"key":"e_1_3_1_50_2","first-page":"4604","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"Vora Sourabh","year":"2020","unstructured":"Sourabh Vora, Alex H. Lang, Bassam Helou, and Oscar Beijbom. 2020. Pointpainting: Sequential fusion for 3d object detection. In IEEE\/CVF International Conference on Computer Vision, 4604\u20134612."},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2024.3362369"},{"key":"e_1_3_1_52_2","doi-asserted-by":"crossref","unstructured":"Puyi Wang Wei Sun Zicheng Zhang Jun Jia Yanwei Jiang Zhichao Zhang Xiongkuo Min and Guangtao Zhai. 2024. Large multi-modality model assisted AI-generated image quality assessment. arXiv:2404.17762. Retrieved from https:\/\/arxiv.org\/abs\/2404.17762","DOI":"10.1145\/3664647.3681471"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/VCIP53242.2021.9675430"},{"issue":"4","key":"e_1_3_1_54_2","doi-asserted-by":"crossref","first-page":"658","DOI":"10.1109\/TG.2022.3212201","article-title":"A deep learning-based multidimensional aesthetic quality assessment method for mobile game images","volume":"15","author":"Wang Tao","year":"2022","unstructured":"Tao Wang, Wei Sun, Wei Wu, Ying Chen, Xiongkuo Min, Wei Lu, Zicheng Zhang, and Guangtao Zhai. 2022. A deep learning-based multidimensional aesthetic quality assessment method for mobile game images. IEEE Transactions on Games 15, 4 (2022), 658\u2013668.","journal-title":"IEEE Transactions on Games"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356520"},{"key":"e_1_3_1_56_2","first-page":"1742","volume-title":"IEEE\/RSJ International Conference on Intelligent Robots and Systems","author":"Wang Zhixin","year":"2019","unstructured":"Zhixin Wang and Kui Jia. 2019. Frustum convnet: Sliding frustums to aggregate local point-wise features for amodal 3d object detection. In IEEE\/RSJ International Conference on Intelligent Robots and Systems, 1742\u20131749."},{"key":"e_1_3_1_57_2","volume-title":"International Conference on Learning Representations","author":"Wu Haoning","year":"2024","unstructured":"Haoning Wu, Zicheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Annan Wang, Chunyi Li, Wenxiu Sun, Qiong Yan, Guangtao Zhai, et al. 2024. Q-Bench: A benchmark for general-purpose foundation models on low-level vision. In International Conference on Learning Representations 2024."},{"key":"e_1_3_1_58_2","unstructured":"Haoning Wu Zicheng Zhang Erli Zhang Chaofeng Chen Liang Liao Annan Wang Kaixin Xu Chunyi Li Jingwen Hou Guangtao Zhai et al. 2023. Q-instruct: Improving low-level visual abilities for multi-modality foundation models. arXiv:2311.06783. Retrieved from https:\/\/arxiv.org\/abs\/2311.06783"},{"key":"e_1_3_1_59_2","unstructured":"Haoning Wu Zicheng Zhang Weixia Zhang Chaofeng Chen Liang Liao Chunyi Li Yixuan Gao Annan Wang Erli Zhang Wenxiu Sun et al. 2023. Q-align: Teaching lmms for visual scoring via discrete text-defined levels. arXiv:2312.17090. Retrieved from https:\/\/arxiv.org\/abs\/2312.17090"},{"key":"e_1_3_1_60_2","unstructured":"Haoning Wu Hanwei Zhu Zicheng Zhang Erli Zhang Chaofeng Chen Liang Liao Chunyi Li Annan Wang Wenxiu Sun Qiong Yan et al. 2024. Towards open-ended visual quality comparison. arXiv:2402.16641. Retrieved from https:\/\/arxiv.org\/abs\/2402.16641"},{"key":"e_1_3_1_61_2","volume-title":"36th International Conference on Neural Information Processing System (NeurIPS)","author":"Wu Xiaoyang","year":"2022","unstructured":"Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, and Hengshuang Zhao. 2022. Point transformer V2: Grouped vector attention and partition-based pooling. In 36th International Conference on Neural Information Processing System (NeurIPS)."},{"key":"e_1_3_1_62_2","first-page":"12460","volume-title":"AAAI Conference on Artificial Intelligence","volume":"34","author":"Xie Liang","year":"2020","unstructured":"Liang Xie, Chao Xiang, Zhengxu Yu, Guodong Xu, Zheng Yang, Deng Cai, and Xiaofei He. 2020. PI-RCNN: An efficient multi-sensor 3D object detector with point-based attentive cont-conv fusion module. In AAAI Conference on Artificial Intelligence, 34 (2020), 12460\u201312467."},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2020.3033117"},{"key":"e_1_3_1_64_2","first-page":"21179","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"Yang Qi","year":"2022","unstructured":"Qi Yang, Yipeng Liu, Siheng Chen, Yiling Xu, and Jun Sun. 2022. No-reference point cloud quality assessment via domain adaptation. In IEEE\/CVF International Conference on Computer Vision, 21179\u201321188."},{"issue":"6","key":"e_1_3_1_65_2","doi-asserted-by":"crossref","first-page":"3015","DOI":"10.1109\/TPAMI.2020.3047083","article-title":"Inferring point cloud quality via graph similarity","volume":"44","author":"Yang Qi","year":"2020","unstructured":"Qi Yang, Zhan Ma, Yiling Xu, Zhu Li, and Jun Sun. 2020. Inferring point cloud quality via graph similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 6 (2020), 3015\u20133029.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_66_2","first-page":"720","volume-title":"European Conference on Computer Vision","author":"Yoo Jin Hyeok","year":"2020","unstructured":"Jin Hyeok Yoo, Yecheol Kim, Jisong Kim, and Jun Won Choi. 2020. 3d-cvf: Generating joint camera and lidar features using cross-view spatial feature fusion for 3d object detection. In European Conference on Computer Vision, 720\u2013736."},{"key":"e_1_3_1_67_2","first-page":"1","article-title":"Patch-based deep autoencoder for point cloud geometry compression","author":"You Kang","year":"2021","unstructured":"Kang You and Pan Gao. 2021. Patch-based deep autoencoder for point cloud geometry compression. In ACM Multimedia Asia, 1\u20137.","journal-title":"ACM Multimedia Asia"},{"key":"e_1_3_1_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2022.3143321"},{"key":"e_1_3_1_69_2","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1145\/3503161.3548175","volume-title":"ACM International Conference on Multimedia","author":"Zhang Chaofan","year":"2022","unstructured":"Chaofan Zhang and Shiguang Liu. 2022. No-reference omnidirectional image quality assessment based on joint network. In ACM International Conference on Multimedia, 943\u2013951."},{"key":"e_1_3_1_70_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2015.2426416"},{"issue":"6","key":"e_1_3_1_71_2","doi-asserted-by":"crossref","first-page":"1266","DOI":"10.1109\/TNNLS.2015.2461603","article-title":"The application of visual saliency models in objective image quality assessment: A statistical evaluation","volume":"27","author":"Zhang Wei","year":"2015","unstructured":"Wei Zhang, Ali Borji, Zhou Wang, Patrick Le Callet, and Hantao Liu. 2015. The application of visual saliency models in objective image quality assessment: A statistical evaluation. IEEE Transactions on Neural Networks and Learning Systems 27, 6 (2015), 1266\u20131278.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_3_1_72_2","first-page":"908","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"Zhang Yanan","year":"2022","unstructured":"Yanan Zhang, Jiaxin Chen, and Di Huang. 2022. CAT-Det: Contrastively augmented transformer for multi-modal 3D object detection. In IEEE\/CVF International Conference on Computer Vision, 908\u2013917."},{"key":"e_1_3_1_73_2","volume-title":"IEEE International Conference on Multimedia and Expo (ICME)","author":"Zhang Zicheng","year":"2024","unstructured":"Zicheng Zhang, Yu Fan, Wei Sun, Xiongkuo Min, Xiaohong Liu, Chunyi Li, Haoning Wu, Weisi Lin, Ning Liu, and Guangtao Zhai. 2024. Optimizing projection-based point cloud quality assessment with human preferred viewpoints selection. IEEE International Conference on Multimedia and Expo (ICME) (2024)."},{"key":"e_1_3_1_74_2","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1109\/ICMEW59549.2023.00082","volume-title":"2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","author":"Zhang Zicheng","year":"2023","unstructured":"Zicheng Zhang, Chunyi Li, Wei Sun, Xiaohong Liu, Xiongkuo Min, and Guangtao Zhai. 2023. A perceptual quality assessment exploration for AIGC images. In 2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW). IEEE, 440\u2013445."},{"key":"e_1_3_1_75_2","doi-asserted-by":"crossref","first-page":"4278","DOI":"10.1109\/ICIP46576.2022.9897249","volume-title":"2022 IEEE International Conference on Image Processing (ICIP)","author":"Zhang Zicheng","year":"2022","unstructured":"Zicheng Zhang, Wei Lu, Wei Sun, Xiongkuo Min, Tao Wang, and Guangtao Zhai. 2022. Surveillance video quality assessment based on quality related retraining. In 2022 IEEE International Conference on Image Processing (ICIP). IEEE, 4278\u20134282."},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2022.3186894"},{"key":"e_1_3_1_77_2","first-page":"1","volume-title":"IEEE International Conference on Multimedia and Expo Workshop","author":"Zhang Zicheng","year":"2021","unstructured":"Zicheng Zhang, Wei Sun, Xiongkuo Min, Tao Wang, Wei Lu, Wenhan Zhu, and Guangtao Zhai. 2021. A no-reference visual quality metric for 3d color meshes. In IEEE International Conference on Multimedia and Expo Workshop. IEEE, 1\u20136."},{"key":"e_1_3_1_78_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2023\/195"},{"key":"e_1_3_1_79_2","volume-title":"IEEE International Conference on Multimedia and Expo","author":"Zhang Zicheng","year":"2021","unstructured":"Zicheng Zhang, Wei Sun, Xiongkuo Min, Wenhan Zhu, Tao Wang, Wei Lu, and Guangtao Zhai. 2021. A no-reference evaluation metric for low-light image enhancement. In IEEE International Conference on Multimedia and Expo."},{"key":"e_1_3_1_80_2","doi-asserted-by":"crossref","unstructured":"Zicheng Zhang Wei Sun Houning Wu Yingjie Zhou Chunyi Li Xiongkuo Min Guangtao Zhai and Weisi Lin. 2023. GMS-3DQA: Projection-based grid mini-patch sampling for 3D model quality assessment. arXiv:2306.05658. Retrieved from https:\/\/arxiv.org\/abs\/2306.05658","DOI":"10.1145\/3643817"},{"key":"e_1_3_1_81_2","doi-asserted-by":"crossref","unstructured":"Zicheng Zhang Wei Sun Yingjie Zhou Wei Lu Yucheng Zhu Xiongkuo Min and Guangtao Zhai. 2023. EEP-3DQA: Efficient and effective projection-based 3D model quality assessment. arXiv:2302.08715. Retrieved from https:\/\/arxiv.org\/abs\/2302.08715","DOI":"10.1109\/ICME55011.2023.00423"},{"key":"e_1_3_1_82_2","unstructured":"Zicheng Zhang Wei Sun Yingjie Zhou Haoning Wu Chunyi Li Xiongkuo Min and Xiaohong Liu. 2023. Advancing zero-shot digital human quality assessment through text-prompted evaluation. arXiv:2307.02808. Retrieved from https:\/\/arxiv.org\/abs\/2307.02808"},{"key":"e_1_3_1_83_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2023.3340894"},{"key":"e_1_3_1_84_2","unstructured":"Zicheng Zhang Haoning Wu Zhongpeng Ji Chunyi Li Erli Zhang Wei Sun Xiaohong Liu Xiongkuo Min Fengyu Sun Shangling Jui et al. 2023. Q-Boost: On visual quality assessment ability of low-level multi-modality foundation models. arXiv:2312.15300. Retrieved from https:\/\/arxiv.org\/abs\/2312.15300"},{"key":"e_1_3_1_85_2","unstructured":"Zicheng Zhang Haoning Wu Chunyi Li Yingjie Zhou Wei Sun Xiongkuo Min Zijian Chen Xiaohong Liu Weisi Lin and Guangtao Zhai. 2024. A-Bench: Are LMMs masters at evaluating AI-generated images? arXiv:2406.03070. Retrieved from https:\/\/arxiv.org\/abs\/2406.03070"},{"key":"e_1_3_1_86_2","doi-asserted-by":"crossref","first-page":"10404","DOI":"10.1109\/TPAMI.2024.3445770","article-title":"Q-bench: A benchmark for multi-modal foundation models on low-level vision from single images to pairs","volume":"6","author":"Zhang Zicheng","year":"2024","unstructured":"Zicheng Zhang, Haoning Wu, Erli Zhang, Guangtao Zhai, and Weisi Lin. 2024. Q-bench: A benchmark for multi-modal foundation models on low-level vision from single images to pairs. IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (2024), 10404\u201310418.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_87_2","doi-asserted-by":"crossref","unstructured":"Zicheng Zhang Haoning Wu Yingjie Zhou Chunyi Li Wei Sun Chaofeng Chen Xiongkuo Min Xiaohong Liu Weisi Lin and Guangtao Zhai. 2024. LMM-PCQA: Assisting point cloud quality assessment with LMM. arXiv:2404.18203. Retrieved from https:\/\/arxiv.org\/abs\/2404.18203","DOI":"10.1145\/3664647.3680946"},{"key":"e_1_3_1_88_2","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Zhang Zicheng","year":"2023","unstructured":"Zicheng Zhang, Wei Wu, Wei Sun, Dangyang Tu, Wei Lu, Xiongkuo Min, Ying Chen, and Guangtao Zhai. 2023. MD-VQA: Multi-dimensional quality assessment for UGC live videos. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_1_89_2","doi-asserted-by":"crossref","unstructured":"Zicheng Zhang Yingjie Zhou Chunyi Li Baixuan Zhao Xiaohong Liu and Guangtao Zhai. 2024. Quality assessment in the era of large models: A survey. arXiv:2409.00031. Retrieved from https:\/\/arxiv.org\/abs\/2409.00031","DOI":"10.1145\/3722559"},{"key":"e_1_3_1_90_2","doi-asserted-by":"crossref","first-page":"2519","DOI":"10.1109\/ICME55011.2023.00429","volume-title":"2023 IEEE International Conference on Multimedia and Expo (ICME)","author":"Zhang Zicheng","year":"2023","unstructured":"Zicheng Zhang, Yingjie Zhou, Wei Sun, Wei Lu, Xiongkuo Min, Yu Wang, and Guangtao Zhai. 2023. Ddh-qa: A dynamic digital humans quality assessment database. In 2023 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2519\u20132524."},{"key":"e_1_3_1_91_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49357.2023.10095347"},{"key":"e_1_3_1_92_2","unstructured":"Zicheng Zhang Yingjie Zhou Wei Sun Xiongkuo Min and Guangtao Zhai. 2023. Simple baselines for projection-based full-reference and no-reference point cloud quality assessment. arXiv:2310.17147. Retrieved from https:\/\/arxiv.org\/abs\/2310.17147"},{"key":"e_1_3_1_93_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01595"},{"key":"e_1_3_1_94_2","unstructured":"Qian-Yi Zhou Jaesik Park and Vladlen Koltun. 2018. Open3D: A modern library for 3D data processing. arXiv:1801.09847. Retrieved from https:\/\/arxiv.org\/abs\/1801.09847"},{"key":"e_1_3_1_95_2","unstructured":"Wei Zhou Qi Yang Qiuping Jiang Guangtao Zhai and Weisi Lin. 2022. Blind quality assessment of 3D dense point clouds with structure guided resampling. arXiv:2208.14603. Retrieved from https:\/\/arxiv.org\/abs\/2208.14603"},{"key":"e_1_3_1_96_2","doi-asserted-by":"crossref","unstructured":"Yingjie Zhou Zicheng Zhang Wei Sun Xiaohong Liu Xiongkuo Min Zhihua Wang Xiao-Ping Zhang and Guangtao Zhai. 2024. THQA: A perceptual quality assessment database for talking heads. arXiv:2404.09003. Retrieved from https:\/\/arxiv.org\/abs\/2404.09003","DOI":"10.1109\/ICIP51287.2024.10647507"},{"issue":"4","key":"e_1_3_1_97_2","first-page":"3","article-title":"Perceptual quality assessment for point clouds: A survey","volume":"21","author":"Zhou Yingjie","year":"2023","unstructured":"Yingjie Zhou, Zicheng Zhang, Wei Sun, Xiongkuo Min, and Guangtao Zhai. 2023. Perceptual quality assessment for point clouds: A survey. ZTE Communications 21, 4 (2023), 3\u201316.","journal-title":"ZTE Communications"},{"key":"e_1_3_1_98_2","unstructured":"Hanwei Zhu Haoning Wu Yixuan Li Zicheng Zhang Baoliang Chen Lingyu Zhu Yuming Fang Guangtao Zhai Weisi Lin and Shiqi Wang. 2024. Adaptive image quality assessment via teaching large multimodal model to compare. arXiv:2405.19298. Retrieved from https:\/\/arxiv.org\/abs\/2405.19298"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3715134","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3715134","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:18:18Z","timestamp":1750295898000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3715134"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,8]]},"references-count":97,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,4,30]]}},"alternative-id":["10.1145\/3715134"],"URL":"https:\/\/doi.org\/10.1145\/3715134","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,8]]},"assertion":[{"value":"2024-01-31","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-27","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}