{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T08:05:46Z","timestamp":1761897946095,"version":"3.41.0"},"reference-count":65,"publisher":"Association for Computing Machinery (ACM)","issue":"8","license":[{"start":{"date-parts":[[2024,6,29]],"date-time":"2024-06-29T00:00:00Z","timestamp":1719619200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"PIC4SeR"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2024,8,31]]},"abstract":"<jats:p>A recent research direction is focused on training Deep Neural Networks (DNNs) to replicate individual subject assessments of media quality. These DNNs are referred to as Artificial Intelligence-based Observers (AIOs). An AIO is designed to simulate, in real-time, the quality ratings of a specific individual, enabling an automatic quality assessment that accounts for subjects' characteristics and preferences. Training AIOs is a promising but challenging research area due to the greater noise in individual raw opinion scores compared to the Mean Opinion Score. Effective learning from noisy labels necessitates the training of complex models on large-scale datasets. Unfortunately, this is challenging for AIOs as the media quality assessment community lacks extensive datasets that include individual opinion scores. To address the complexity of the task, we first created a dataset comprising two million samples, with synthetic labels derived from human annotation. We then trained a customized network for image quality assessment, named Multi-Distortion ResNet50 (MDResNet50), on this dataset. 
The weights of the MDResNet50 were subsequently utilized to initialize the learning process of each AIO, thereby avoiding the need to train a complex model from scratch on a small-scale dataset with raw individual opinion scores. Computational experiments show that our approach significantly advances the state-of-the-art in the AIO research. In particular: (i) we demonstrate through a simulation the ability of AIOs to mimic two well-known behavioral characteristics of a subject, i.e., bias and inconsistency, when scoring the media quality; (ii) we train and release DNN-based AIOs that, compared to the state-of-the-art, exhibit a higher performance with a statistical significance in assessing multiple image distortions; (iii) we train AIOs that more accurately mimic the sensitivity of real subjects to noise and color saturation and also better predict the opinion score distribution compared to the state-of-the-art AIOs.<\/jats:p>","DOI":"10.1145\/3664198","type":"journal-article","created":{"date-parts":[[2024,5,21]],"date-time":"2024-05-21T11:18:49Z","timestamp":1716290329000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Multiple Image Distortion DNN Modeling Individual Subject Quality Assessment"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5127-9935","authenticated-orcid":false,"given":"Lohic","family":"Fotio Tiotsop","sequence":"first","affiliation":[{"name":"Control and Computer Engineering, Politecnico di Torino, Torino, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0159-5718","authenticated-orcid":false,"given":"Antonio","family":"Servetti","sequence":"additional","affiliation":[{"name":"Control and Computer Engineering, Politecnico di Torino, Torino, 
Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6791-1325","authenticated-orcid":false,"given":"Peter","family":"Pocta","sequence":"additional","affiliation":[{"name":"Multimedia and Information-Communication Technology, Zilinska univerzita v Ziline, Zilina, Slovakia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9530-3466","authenticated-orcid":false,"given":"Glenn","family":"Van Wallendael","sequence":"additional","affiliation":[{"name":"Ghent University - imec, Ghent Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2739-3708","authenticated-orcid":false,"given":"Marcus","family":"Barkowsky","sequence":"additional","affiliation":[{"name":"Deggendorf Institute of Technology, Deggendorf, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8906-354X","authenticated-orcid":false,"given":"Enrico","family":"Masala","sequence":"additional","affiliation":[{"name":"Control and Computer Engineering, Politecnico di Torino, Torino Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,6,29]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2018.2868262"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2017.2760518"},{"key":"e_1_3_2_4_2","unstructured":"Chaofeng Chen and Jiadi Mo. 2022. IQA-PyTorch: PyTorch Toolbox for Image Quality Assessment. 
Retrieved from https:\/\/github.com\/chaofengc\/IQA-PyTorch"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TBC.2011.2104671"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2013.2292894"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2022.3229839"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3547872"},{"key":"e_1_3_2_9_2","unstructured":"Yixuan Gao Xiongkuo Min Yucheng Zhu Jing Li Xiao-Ping Zhang and Guangtao Zhai. 2022. Image quality assessment: From mean opinion score to opinion score distribution. Retrieved from https:\/\/github.com\/YixuanGao98\/Image-Quality-Assessment-From-Mean-Opinion-Score-to-Opinion-Score-Distribution"},{"key":"e_1_3_2_10_2","first-page":"249","article-title":"Understanding the difficulty of training deep feedforward neural networks","volume":"9","author":"Glorot Xavier","year":"2010","unstructured":"Xavier Glorot and Y. Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. J. Mach. Learn. Res. Proc. Track 9 (Jan. 2010), 249\u2013256.","journal-title":"J. Mach. Learn. Res. Proc. Track"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1007\/s41233-016-0002-1"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/QoMEX.2011.6065690"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.2967829"},{"key":"e_1_3_2_15_2","unstructured":"ITU-T. Rec. BT.500. 2012. Methodology for the Subjective Assessment of the Quality of Television Pictures. https:\/\/www.itu.int\/rec\/R-REC-BT.500"},{"key":"e_1_3_2_16_2","unstructured":"ITU-T. Rec. P.910. 2008. Subjective Video Quality Assessment Methods for Multimedia Applications. https:\/\/www.itu.int\/rec\/T-REC-P.910"},{"key":"e_1_3_2_17_2","unstructured":"ITU-T. Rec. P.913. 2021. 
Methods for the Subjective Assessment of Video Quality Audio Quality and Audiovisual Quality of Internet Video and Distribution Quality Television in Any Environment. https:\/\/www.itu.int\/rec\/T-REC-P.913"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/QOMEX.2009.5246979"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2015.2484963"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACSSC.2012.6489321"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.224"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00836"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2019.2923051"},{"key":"e_1_3_2_24_2","first-page":"1097","volume-title":"Advances in Neural Information Processing Systems","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems. AIP, 1097\u20131105."},{"key":"e_1_3_2_25_2","unstructured":"Christopher Lennan Hao Nguyen and Dat Tran. 2018. Image Quality Assessment. Retrieved from https:\/\/github.com\/idealo\/image-quality-assessment"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413619"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.2017.26"},{"issue":"11","key":"e_1_3_2_28_2","first-page":"131","article-title":"A simple model for subject behavior in subjective experiments","volume":"2020","author":"Li Zhi","year":"2020","unstructured":"Zhi Li, Christos G. Bampis, Lucjan Janowski, and Ioannis Katsavounidis. 2020. A simple model for subject behavior in subjective experiments. Electr. Imag. 2020, 11 (2020), 131\u20131.","journal-title":"Electr. 
Imag."},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2019.115749"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.118"},{"key":"e_1_3_2_31_2","unstructured":"MATLAB. 2023. Pretrained Vision Transformer (ViT) Neural Network. Retrieved from https:\/\/it.mathworks.com\/help\/vision\/ref\/visiontransformer.html"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2013.155"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2012.2214050"},{"key":"e_1_3_2_34_2","first-page":"1180","volume-title":"Proceedings of the 15th IEEE International Conference on Image Processing","author":"Ninassi A.","year":"2008","unstructured":"A. Ninassi, O. Le Meur, P. Le Callet, and D. Barba. 2008. Which semi-local visual masking model for wavelet-based image quality metric? In Proceedings of the 15th IEEE International Conference on Image Processing. IEEE, 1180\u20131183."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.displa.2021.102075"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10368-6_2"},{"key":"e_1_3_2_37_2","first-page":"1","article-title":"Evidential deep learning to quantify classification uncertainty","volume":"31","author":"Sensoy Murat","year":"2018","unstructured":"Murat Sensoy, Lance Kaplan, and Melih Kandemir. 2018. Evidential deep learning to quantify classification uncertainty. Adv. Neural Info. Process. Syst. 31 (2018), 1\u201311.","journal-title":"Adv. Neural Info. Process. Syst."},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/QoMEX.2019.8743296"},{"key":"e_1_3_2_39_2","unstructured":"HR Sheikh. 2005. LIVE Image Quality Assessment Database Release 2. Retrieved from http:\/\/live.ece.utexas.edu\/research\/quality"},{"key":"e_1_3_2_40_2","unstructured":"H. R. Sheikh Z. Wang L. Cormack and A. C. Bovik. 2005. LIVE image quality assessment database. 
Retrieved from http:\/\/live.ece.utexas.edu\/research\/quality"},{"key":"e_1_3_2_41_2","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. Retrieved from https:\/\/arXiv:1409.1556"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2022.3152527"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00530-014-0446-1"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00372"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2831899"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1117\/1.1469618"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3127395"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/QoMEX.2019.8743303"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/3464393"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/MMSP53017.2021.9733456"},{"key":"e_1_3_2_51_2","first-page":"1","article-title":"Modeling and estimating the subjects\u2019 diversity of opinions in video quality assessment: a neural network based approach","volume":"80","author":"Tiotsop Lohic Fotio","year":"2020","unstructured":"Lohic Fotio Tiotsop, Tomas Mizdos, Miroslav Uhrina, Marcus Barkowsky, Peter Pocta, and Enrico Masala. 2020. Modeling and estimating the subjects\u2019 diversity of opinions in video quality assessment: a neural network based approach. Multimedia Tools Appl. 
80 (2020), 1\u201319.","journal-title":"Multimedia Tools Appl."},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.2352\/ISSN.2470-1173.2020.11.HVEI-130"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/QoMEX55416.2022.9900903"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2022.116917"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP40776.2020.9053739"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2018.8486528"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8683359"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvcir.2022.103676"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2003.819861"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2013.2292568"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW56347.2022.00126"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00363"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP42928.2021.9506075"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2018.2886771"},{"key":"e_1_3_2_65_2","doi-asserted-by":"crossref","unstructured":"Zicheng Zhang Haoning Wu Zhongpeng Ji Chunyi Li Erli Zhang Wei Sun Xiaohong Liu Xiongkuo Min Fengyu Sun Shangling Jui et\u00a0al. 2023. Q-Boost: On visual quality assessment ability of low-level multi-modality foundation models. 
Retrieved from https:\/\/arXiv:2312.15300","DOI":"10.1109\/ICMEW63481.2024.10645451"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.1145\/3183512"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3664198","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3664198","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T18:44:02Z","timestamp":1750272242000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3664198"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,29]]},"references-count":65,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2024,8,31]]}},"alternative-id":["10.1145\/3664198"],"URL":"https:\/\/doi.org\/10.1145\/3664198","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2024,6,29]]},"assertion":[{"value":"2023-11-17","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-04-26","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-06-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}