{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T17:01:11Z","timestamp":1775667671082,"version":"3.50.1"},"reference-count":47,"publisher":"Cambridge University Press (CUP)","license":[{"start":{"date-parts":[[2023,3,23]],"date-time":"2023-03-23T00:00:00Z","timestamp":1679529600000},"content-version":"unspecified","delay-in-days":81,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. 62207023"],"award-info":[{"award-number":["No. 62207023"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["AIEDAM"],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In the field of content generation by machine, the state-of-the-art text-to-image model, DALL\u22c5E, has advanced and diverse capacities for the combinational image generation with specific textual prompts. The images generated by DALL\u22c5E seem to exhibit an appreciable level of combinational creativity close to that of humans in terms of visualizing a combinational idea. Although there are several common metrics which can be applied to assess the quality of the images generated by generative models, such as IS, FID, GIQA, and CLIP, it is unclear whether these metrics are equally applicable to assessing images containing combinational creativity. In this study, we collected the generated image data from machine (DALL\u22c5E) and human designers, respectively. The results of group ranking in the Consensual Assessment Technique (CAT) and the Turing Test (TT) were used as the benchmarks to assess the combinational creativity. Considering the metrics\u2019 mathematical principles and different starting points in evaluating image quality, we introduced coincident rate (CR) and average rank variation (ARV) which are two comparable spaces. An experiment to calculate the consistency of group ranking of each metric by comparing the benchmarks then was conducted. By comparing the consistency results of CR and ARV on group ranking, we summarized the applicability of the existing evaluation metrics in assessing generative images containing combinational creativity. In the four metrics, GIQA performed the closest consistency to the CAT and TT. It shows the potential as an automated assessment for images containing combinational creativity, which can be used to evaluate the images containing combinational creativity in the relevant task of design and engineering such as conceptual sketch, digital design image, and prototyping image.<\/jats:p>","DOI":"10.1017\/s0890060423000069","type":"journal-article","created":{"date-parts":[[2023,3,23]],"date-time":"2023-03-23T08:19:51Z","timestamp":1679559591000},"update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":16,"title":["A study of the evaluation metrics for generative images containing combinational creativity"],"prefix":"10.1017","volume":"37","author":[{"given":"Boheng","family":"Wang","sequence":"first","affiliation":[]},{"given":"Yunhuai","family":"Zhu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9049-0394","authenticated-orcid":false,"given":"Liuqing","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Jingcheng","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Lingyun","family":"Sun","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2465-8822","authenticated-orcid":false,"given":"Peter","family":"Childs","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2023,3,23]]},"reference":[{"key":"S0890060423000069_ref27","first-page":"10356","article-title":"Evaluation of coco validation 2017 dataset with yolov3","volume":"6","author":"Kim","year":"2019","journal-title":"Evaluation"},{"key":"S0890060423000069_ref12","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvcir.2019.02.009"},{"key":"S0890060423000069_ref41","doi-asserted-by":"publisher","DOI":"10.1017\/9781108185936"},{"key":"S0890060423000069_ref47","unstructured":"Zhang, H , Yin, W , Fang, Y , Li, L , Duan, B , Wu, Z , \u2026 and Wang, H (2021) ERNIE-ViLG: unified generative pre-training for bidirectional vision-language generation. arXiv preprint arXiv:2112.15283."},{"key":"S0890060423000069_ref45","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511763205.008"},{"key":"S0890060423000069_ref17","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1111\/j.2517-6161.1977.tb01624.x","article-title":"Spearman's footrule as a measure of disarray","volume":"39","author":"Diaconis","year":"1977","journal-title":"Journal of the Royal Statistical Society: Series B (Methodological)"},{"key":"S0890060423000069_ref31","unstructured":"Mansimov, E , Parisotto, E , Ba, JL and Salakhutdinov, R (2015) Generating images from captions with attention. arXiv preprint arXiv:1511.02793."},{"key":"S0890060423000069_ref46","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-018-3849-7"},{"key":"S0890060423000069_ref11","first-page":"212","volume-title":"Handbook of Research on Creativity","author":"Burnard","year":"2013"},{"key":"S0890060423000069_ref21","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11957"},{"key":"S0890060423000069_ref2","doi-asserted-by":"publisher","DOI":"10.1037\/0022-3514.43.5.997"},{"key":"S0890060423000069_ref23","first-page":"6626","article-title":"GANs trained by a two time-scale update rule converge to a local Nash equilibrium","volume":"30","author":"Heusel","year":"2017","journal-title":"Advances in Neural Information Processing Systems"},{"key":"S0890060423000069_ref5","doi-asserted-by":"publisher","DOI":"10.4324\/9780203508527"},{"key":"S0890060423000069_ref6","doi-asserted-by":"publisher","DOI":"10.1108\/03684921011036132"},{"key":"S0890060423000069_ref22","unstructured":"Han, J (2018) Combinational creativity and computational creativity."},{"key":"S0890060423000069_ref18","first-page":"19822","article-title":"CogView: mastering text-to-image generation via transformers","volume":"34","author":"Ding","year":"2021","journal-title":"Advances in Neural Information Processing Systems"},{"key":"S0890060423000069_ref42","doi-asserted-by":"publisher","DOI":"10.1002\/j.2162-6057.1972.tb00936.x"},{"key":"S0890060423000069_ref1","doi-asserted-by":"publisher","DOI":"10.1609\/aimag.v37i1.2643"},{"key":"S0890060423000069_ref15","doi-asserted-by":"publisher","DOI":"10.1002\/j.2162-6057.1975.tb00561.x"},{"key":"S0890060423000069_ref20","first-page":"23","volume-title":"GIQA: Generated Image Quality Assessment","author":"Gu","year":"2020"},{"key":"S0890060423000069_ref30","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"S0890060423000069_ref43","first-page":"433","article-title":"Computing machinery and intelligence-AM turing","volume":"59","author":"Turing","year":"2007","journal-title":"Mind"},{"key":"S0890060423000069_ref29","unstructured":"Liang, W , Zhang, Y , Kwon, Y , Yeung, S and Zou, J (2022) Mind the gap: understanding the modality gap in multi-modal contrastive representation learning. arXiv preprint arXiv:2203.02053."},{"key":"S0890060423000069_ref7","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2018.10.009"},{"key":"S0890060423000069_ref3","first-page":"347","article-title":"Consensual assessment","volume":"1","author":"Amabile","year":"1999","journal-title":"Encyclopedia of Creativity"},{"key":"S0890060423000069_ref16","first-page":"23","article-title":"Developing instrumentation for assessing creativity in engineering design","volume":"27","author":"Denson","year":"2015","journal-title":"Journal of Technology Education"},{"key":"S0890060423000069_ref36","unstructured":"Ramesh, A , Dhariwal, P , Nichol, A , Chu, C and Chen, M (2022) Hierarchical text-conditional image generation with CLIP latents. arXiv preprint arXiv:2204.06125."},{"key":"S0890060423000069_ref10","doi-asserted-by":"publisher","DOI":"10.21315\/mjms2018.25.6.9"},{"key":"S0890060423000069_ref35","unstructured":"Ramesh, A , Pavlov, M , Goh, G , Gray, S , Voss, C , Radford, A and Sutskever, I (2021) Zero-shot text-to-image generation. Paper presented at the Proceedings of the 38th International Conference on Machine Learning, Proceedings of Machine Learning Research. https:\/\/proceedings.mlr.press\/v139\/ramesh21a.html"},{"key":"S0890060423000069_ref37","first-page":"12247","article-title":"Classification accuracy score for conditional generative models","volume":"32","author":"Ravuri","year":"2019","journal-title":"Advances in Neural Information Processing Systems"},{"key":"S0890060423000069_ref8","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-010-0105-2_12"},{"key":"S0890060423000069_ref24","doi-asserted-by":"publisher","DOI":"10.1080\/10400410802059929"},{"key":"S0890060423000069_ref26","doi-asserted-by":"publisher","DOI":"10.1080\/10400419.2010.481529"},{"key":"S0890060423000069_ref14","doi-asserted-by":"crossref","unstructured":"Cropley, DH and Kaufman, JC (2013) Rating the creativity of products. In Handbook of Research on Creativity. Edward Elgar Publishing.","DOI":"10.4337\/9780857939814.00025"},{"key":"S0890060423000069_ref13","unstructured":"Chu, H , Urtasun, R and Fidler, S (2016) Song from PI: a musically plausible network for pop music generation. arXiv preprint arXiv:1611.03477."},{"key":"S0890060423000069_ref19","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2021.07.019"},{"key":"S0890060423000069_ref4","unstructured":"Amato, G , Behrmann, M , Bimbot, F , Caramiaux, B , Falchi, F , Garcia, A , Geurts, J , Gibert, J , Gravier, G , Holken, H and Koenitz, H (2019) AI in the media and creative industries. arXiv preprint arXiv:1905.04175."},{"key":"S0890060423000069_ref25","volume-title":"Essentials of Creativity Assessment","author":"Kaufman","year":"2008"},{"key":"S0890060423000069_ref28","doi-asserted-by":"publisher","DOI":"10.1038\/35090055"},{"key":"S0890060423000069_ref9","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"S0890060423000069_ref32","doi-asserted-by":"publisher","DOI":"10.1016\/0142-694X(89)90021-5"},{"key":"S0890060423000069_ref33","unstructured":"Pearce, MT and Wiggins, GA (2007) Evaluating cognitive models of musical composition. Paper presented at the Proceedings of the 4th International Joint Workshop on Computational Creativity."},{"key":"S0890060423000069_ref40","unstructured":"Shin, A , Crestel, L , Kato, H , Saito, K , Ohnishi, K , Yamaguchi, M and Harada, T (2017) Melody generation for pop music via word representation of musical properties. arXiv preprint arXiv:1710.11549."},{"key":"S0890060423000069_ref44","first-page":"6000","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Advances in Neural Information Processing Systems"},{"key":"S0890060423000069_ref38","first-page":"2226","article-title":"Improved techniques for training GANs","volume":"29","author":"Salimans","year":"2016","journal-title":"Advances in Neural Information Processing Systems"},{"key":"S0890060423000069_ref34","unstructured":"Radford, A , Kim, JW , Hallacy, C , Ramesh, A , Goh, G , Agarwal, S , Sastry, G , Askell, A , Mishkin, P , Clark, J , Krueger, G and Sutskever, I (2021) Learning transferable visual models from natural language supervision. Paper presented at the Proceedings of the 38th International Conference on Machine Learning, Proceedings of Machine Learning Research. https:\/\/proceedings.mlr.press\/v139\/radford21a.html"},{"key":"S0890060423000069_ref39","doi-asserted-by":"publisher","DOI":"10.1016\/j.destud.2011.01.002"}],"container-title":["Artificial Intelligence for Engineering Design, Analysis and Manufacturing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S0890060423000069","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T22:19:27Z","timestamp":1729117167000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S0890060423000069\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"references-count":47,"alternative-id":["S0890060423000069"],"URL":"https:\/\/doi.org\/10.1017\/s0890060423000069","relation":{},"ISSN":["0890-0604","1469-1760"],"issn-type":[{"value":"0890-0604","type":"print"},{"value":"1469-1760","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"Copyright \u00a9 The Author(s), 2023. Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}}],"article-number":"e11"}}