{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T21:50:36Z","timestamp":1775598636583,"version":"3.50.1"},"reference-count":178,"publisher":"Springer Science and Business Media LLC","issue":"21","license":[{"start":{"date-parts":[[2024,8,14]],"date-time":"2024-08-14T00:00:00Z","timestamp":1723593600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,8,14]],"date-time":"2024-08-14T00:00:00Z","timestamp":1723593600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>In recent years, the study of artificial intelligence (AI) has undergone a paradigm shift. This has been propelled by the groundbreaking capabilities of generative models in both supervised and unsupervised learning scenarios. Generative AI has shown state-of-the-art performance in solving perplexing real-world conundrums in fields such as image translation, medical diagnostics, textual imagery fusion, natural language processing, and beyond. This paper documents the systematic review and analysis of recent advancements and techniques in Generative AI with a detailed discussion of their applications including application-specific models. Indeed, the major impact that generative AI has made to date has been in language generation with the development of large language models, in image translation, and in several other interdisciplinary applications. Moreover, the primary contribution of this paper lies in its coherent synthesis of the latest advancements in these areas, seamlessly weaving together contemporary breakthroughs in the field. 
In particular, it explores the future trajectory of generative AI. The paper concludes with a discussion of Responsible AI principles and the ethical considerations necessary for the sustainability and growth of these generative models.<\/jats:p>","DOI":"10.1007\/s11042-024-20016-1","type":"journal-article","created":{"date-parts":[[2024,8,14]],"date-time":"2024-08-14T08:02:01Z","timestamp":1723622521000},"page":"23661-23700","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":265,"title":["Generative artificial intelligence: a systematic review and applications"],"prefix":"10.1007","volume":"84","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2171-9332","authenticated-orcid":false,"given":"Sandeep Singh","family":"Sengar","sequence":"first","affiliation":[]},{"given":"Affan Bin","family":"Hasan","sequence":"additional","affiliation":[]},{"given":"Sanjay","family":"Kumar","sequence":"additional","affiliation":[]},{"given":"Fiona","family":"Carroll","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,8,14]]},"reference":[{"issue":"2","key":"20016_CR1","doi-asserted-by":"crossref","first-page":"223","DOI":"10.3390\/biomedicines10020223","volume":"10","author":"B Ahmad","year":"2022","unstructured":"Ahmad B, Sun J, You Q, Palade V, Mao Z (2022) Brain tumor classification using a combination of variational autoencoders and generative adversarial networks. Biomedicines 10(2):223","journal-title":"Biomedicines"},{"key":"20016_CR2","doi-asserted-by":"crossref","unstructured":"Ahuja K, Diddee H, Hada R, Ochieng M, Ramesh K, Jain P, Nambi A, Ganu T, Segal S, Axmed M, Bali K, Sitaram S (2023) Mega: Multilingual evaluation of generative ai","DOI":"10.18653\/v1\/2023.emnlp-main.258"},{"key":"20016_CR3","unstructured":"Akbik A, Blythe D, Vollgraf R (2018) Contextual string embeddings for sequence labeling. 
In: Proceedings of the 27th international conference on computational linguistics, pp 1638\u20131649"},{"key":"20016_CR4","doi-asserted-by":"crossref","first-page":"24205","DOI":"10.1109\/ACCESS.2018.2829199","volume":"6","author":"K Al-Sabahi","year":"2018","unstructured":"Al-Sabahi K, Zuping Z, Nadher M (2018) A hierarchical structured self-attentive model for extractive document summarization (hssas). IEEE Access 6:24205\u201324212","journal-title":"IEEE Access"},{"issue":"1","key":"20016_CR5","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1186\/s13244-022-01237-0","volume":"13","author":"H Ali","year":"2022","unstructured":"Ali H, Biswas MR, Mohsen F, Shah U, Alamgir A, Mousa O, Shah Z (2022) The role of generative adversarial networks in brain mri: a scoping review. Insights Imaging 13(1):98","journal-title":"Insights Imaging"},{"issue":"3","key":"20016_CR6","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1002\/stvr.354","volume":"16","author":"M Alshraideh","year":"2006","unstructured":"Alshraideh M, Bottaci L (2006) Search-based software test data generation for string data using program-specific search operators. Software Testing, Verification and Reliability 16(3):175\u2013203","journal-title":"Software Testing, Verification and Reliability"},{"key":"20016_CR7","unstructured":"Arjovsky M, Chintala S, Bottou L (2017a) Wasserstein gan"},{"key":"20016_CR8","unstructured":"Arjovsky M, Chintala S, Bottou L (2017b) Wasserstein generative adversarial networks. In: International conference on machine learning, pp 214\u2013223. PMLR"},{"key":"20016_CR9","doi-asserted-by":"crossref","unstructured":"Arnab A, Dehghani M, Heigold G, Sun C, Lu\u010di\u0107 M, Schmid C (2021) Vivit: A video vision transformer. 
In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 6836\u20136846","DOI":"10.1109\/ICCV48922.2021.00676"},{"key":"20016_CR10","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.inffus.2019.12.012","volume":"58","author":"AB Arrieta","year":"2020","unstructured":"Arrieta AB, D\u00edaz-Rodr\u00edguez N, Del Ser J, Bennetot A, Tabik S, Barbado A, Garc\u00eda S, Gil-L\u00f3pez S, Molina D, Benjamins R et al (2020) Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and challenges toward responsible ai. Inform Fusion 58:82\u2013115","journal-title":"Inform Fusion"},{"key":"20016_CR11","doi-asserted-by":"crossref","unstructured":"Atapour-Abarghouei A, Breckon TP (2018) Real-time monocular depth estimation using synthetic data with domain adaptation via image style transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","DOI":"10.1109\/CVPR.2018.00296"},{"key":"20016_CR12","doi-asserted-by":"crossref","unstructured":"Balazevic I, Allen C, Hospedales T (2019) TuckER: Tensor factorization for knowledge graph completion. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics","DOI":"10.18653\/v1\/D19-1522"},{"key":"20016_CR13","doi-asserted-by":"crossref","unstructured":"Barsoum E, Kender J, Liu Z (2018) Hp-gan: Probabilistic 3d human motion prediction via gan. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) workshops","DOI":"10.1109\/CVPRW.2018.00191"},{"key":"20016_CR14","unstructured":"Bertasius G, Wang H, Torresani L (2021) Is space-time attention all you need for video understanding? In: ICML, vol 2, p 4"},{"key":"20016_CR15","unstructured":"Bi\u0144kowski M, Sutherland DJ, Arbel M, Gretton A (2018) Demystifying mmd gans. 
arXiv preprint arXiv:1801.01401"},{"key":"20016_CR16","unstructured":"Bozkurt A (2023) Generative artificial intelligence (ai) powered conversational educational agents: The inevitable paradigm shift. Asian J Dist Educ 18(1)"},{"key":"20016_CR17","first-page":"1877","volume":"33","author":"T Brown","year":"2020","unstructured":"Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A et al (2020) Language models are few-shot learners. Adv Neural Inf Process Syst 33:1877\u20131901","journal-title":"Adv Neural Inf Process Syst"},{"issue":"2","key":"20016_CR18","first-page":"1273","volume":"9","author":"C Cabanes","year":"2012","unstructured":"Cabanes C, Grouazel A, von Schuckmann K, Hamon M, Turpin V, Coatanoan C, Guinehut S, Boone C, Ferry N, Reverdin G et al (2012) The cora dataset: validation and diagnostics of ocean temperature and salinity in situ measurements. Ocean Sci Discuss 9(2):1273\u20131312","journal-title":"Ocean Sci Discuss"},{"key":"20016_CR19","doi-asserted-by":"crossref","unstructured":"Cabreza JN, Solano GA, Ojeda SA, Munar V (2022) Anomaly detection for alzheimer\u2019s disease in brain mris via unsupervised generative adversarial learning. 
In: 2022 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), pp 1\u20135","DOI":"10.1109\/ICAIIC54071.2022.9722678"},{"key":"20016_CR20","doi-asserted-by":"crossref","unstructured":"Cai L, Wang WY (2018) Kbgan: Adversarial learning for knowledge graph embeddings","DOI":"10.18653\/v1\/N18-1133"},{"key":"20016_CR21","unstructured":"Cao Y, Li S, Liu Y, Yan Z, Dai Y, Yu PS, Sun L (2023) A comprehensive survey of ai-generated content (aigc): A history of generative ai from gan to chatgpt"},{"key":"20016_CR22","doi-asserted-by":"crossref","unstructured":"Chandak A, Lee W, Stamp M (2021) A comparison of word2vec, hmm2vec, and pca2vec for malware classification","DOI":"10.1007\/978-3-030-62582-5_11"},{"key":"20016_CR23","unstructured":"Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P (2016) Infogan: Interpretable representation learning by information maximizing generative adversarial nets"},{"key":"20016_CR24","doi-asserted-by":"crossref","unstructured":"Cheong SY, Mustafa A, Gilbert A (2023) Upgpt: Universal diffusion model for person image generation, editing and pose transfer","DOI":"10.1109\/ICCVW60793.2023.00451"},{"key":"20016_CR25","doi-asserted-by":"crossref","unstructured":"Clark K, Luong M-T, Manning CD, Le QV (2018) Semi-supervised sequence modeling with cross-view training. arXiv preprint arXiv:1809.08370","DOI":"10.18653\/v1\/D18-1217"},{"key":"20016_CR26","doi-asserted-by":"crossref","unstructured":"Conneau A, Lample G, Rinott R, Williams A, Bowman SR, Schwenk H, Stoyanov V (2018) Xnli: Evaluating cross-lingual sentence representations. arXiv preprint arXiv:1809.05053","DOI":"10.18653\/v1\/D18-1269"},{"key":"20016_CR27","first-page":"273","volume":"20","author":"C Cortes","year":"1995","unstructured":"Cortes C, Vapnik V (1995) Support-vector networks. Machine learning 20:273\u2013297","journal-title":"Support-vector networks. 
Machine learning"},{"key":"20016_CR28","doi-asserted-by":"crossref","unstructured":"Courant R (1943) Variational methods for the solution of problems of equilibrium and vibrations","DOI":"10.1090\/S0002-9904-1943-07818-4"},{"issue":"1","key":"20016_CR29","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1109\/MSP.2017.2765202","volume":"35","author":"A Creswell","year":"2018","unstructured":"Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA (2018) Generative adversarial networks: An overview. IEEE Signal Process Mag 35(1):53\u201365","journal-title":"IEEE Signal Process Mag"},{"key":"20016_CR30","doi-asserted-by":"crossref","unstructured":"Dar SUH, Yurt M, Karacan L, Erdem A, Erdem E, \u00c7ukur T (2018) Image synthesis in multi-contrast mri with conditional generative adversarial networks","DOI":"10.1109\/TMI.2019.2901750"},{"key":"20016_CR31","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805"},{"issue":"1","key":"20016_CR32","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1186\/s13244-022-01315-3","volume":"13","author":"A Dimitriadis","year":"2022","unstructured":"Dimitriadis A, Trivizakis E, Papanikolaou N, Tsiknakis M, Marias K (2022) Enhancing cancer differentiation with synthetic mri examinations via generative models: a systematic review. Insights Imaging 13(1):188","journal-title":"Insights Imaging"},{"key":"20016_CR33","unstructured":"Dinh L, Krueger D, Bengio Y (2015) Nice: Non-linear independent components estimation"},{"key":"20016_CR34","unstructured":"Dinh L, Sohl-Dickstein J, Bengio S (2017) Density estimation using real nvp"},{"key":"20016_CR35","unstructured":"Donahue J, Kr\u00e4henb\u00fchl P, Darrell T (2016) Adversarial feature learning. 
arXiv preprint arXiv:1605.09782"},{"key":"20016_CR36","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2020) An image is worth 16x16 words: Transformers for image recognition at scale. CoRR. arxiv:2010.11929"},{"key":"20016_CR37","first-page":"102642","volume":"71","author":"YK Dwivedi","year":"2023","unstructured":"Dwivedi YK, Kshetri N, Hughes L, Slade EL, Jeyaraj A, Kar AK, Baabdullah AM, Koohang A, Raghavan V, Ahuja M, Albanna H, Albashrawi MA, Al-Busaidi AS, Balakrishnan J, Barlette Y, Basu S, Bose I, Brooks L, Buhalis D, Carter L, Chowdhury S, Crick T, Cunningham SW, Davies GH, Davison RM, D\u00e9 R, Dennehy D, Duan Y, Dubey R, Dwivedi R, Edwards JS, Flavi\u00e1n C, Gauld R, Grover V, Hu M-C, Janssen M, Jones P, Junglas I, Khorana S, Kraus S, Larsen KR, Latreille P, Laumer S, Malik FT, Mardani A, Mariani M, Mithas S, Mogaji E, Nord JH, O\u2019Connor S, Okumus F, Pagani M, Pandey N, Papagiannidis S, Pappas IO, Pathak N, Pries-Heje J, Raman R, Rana NP, Rehm S-V, Ribeiro-Navarrete S, Richter A, Rowe F, Sarker S, Stahl BC, Tiwari MK, van der Aalst W, Venkatesh V, Viglia G, Wade M, Walton P, Wirtz J, Wright R (2023) Opinion paper: \u201cso what if chatgpt wrote it?\u2019\u2019 multidisciplinary perspectives on opportunities, challenges and implications of generative conversational ai for research, practice and policy. Int J Inf Manage 71:102642","journal-title":"Int J Inf Manage"},{"issue":"1","key":"20016_CR38","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1038\/s41591-018-0316-z","volume":"25","author":"A Esteva","year":"2019","unstructured":"Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, Cui C, Corrado G, Thrun S, Dean J (2019) A guide to deep learning in healthcare. 
Nat Med 25(1):24\u201329","journal-title":"Nat Med"},{"key":"20016_CR39","doi-asserted-by":"crossref","unstructured":"Fan H, Xiong B, Mangalam K, Li Y, Yan Z, Malik J, Feichtenhofer C (2021) Multiscale vision transformers. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 6824\u20136835","DOI":"10.1109\/ICCV48922.2021.00675"},{"key":"20016_CR40","doi-asserted-by":"crossref","unstructured":"Feichtenhofer C (2020) X3d: Expanding architectures for efficient video recognition. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 203\u2013213","DOI":"10.1109\/CVPR42600.2020.00028"},{"key":"20016_CR41","doi-asserted-by":"crossref","unstructured":"Feichtenhofer C, Fan H, Malik J, He K (2019) Slowfast networks for video recognition. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 6202\u20136211","DOI":"10.1109\/ICCV.2019.00630"},{"key":"20016_CR42","doi-asserted-by":"crossref","unstructured":"Feng L, Li Q, Peng Z, Tan S, Zhou B (2023) Trafficgen: Learning to generate diverse and realistic traffic scenarios. In: 2023 IEEE international conference on robotics and automation (ICRA), pp 3567\u20133575","DOI":"10.1109\/ICRA48891.2023.10160296"},{"key":"20016_CR43","doi-asserted-by":"crossref","unstructured":"Fontanini T, Ferrari C, Bertozzi M, Prati A (2023) Automatic generation of semantic parts for face image synthesis","DOI":"10.1007\/978-3-031-43148-7_18"},{"key":"20016_CR44","doi-asserted-by":"crossref","unstructured":"Frid-Adar M, Klang E, Amitai M, Goldberger J, Greenspan H (2018) Synthetic data augmentation using gan for improved liver lesion classification","DOI":"10.1109\/ISBI.2018.8363576"},{"key":"20016_CR45","doi-asserted-by":"crossref","unstructured":"Gan J, Wang W, Leng J, Gao X (2022) Higan+: Handwriting imitation gan with disentangled representations. 
ACM Trans Graph 42(1)","DOI":"10.1145\/3550070"},{"issue":"7","key":"20016_CR46","doi-asserted-by":"crossref","first-page":"4961","DOI":"10.1109\/TII.2020.2968370","volume":"16","author":"Y Gao","year":"2020","unstructured":"Gao Y, Liu X, Xiang J (2020) Fem simulation-based generative adversarial networks to detect bearing faults. IEEE Trans Industr Inf 16(7):4961\u20134971","journal-title":"IEEE Trans Industr Inf"},{"key":"20016_CR47","unstructured":"Golany T, Radinsky K, Freedman D (2020) SimGANs: Simulator-based generative adversarial networks for ECG synthesis to improve deep ECG classification. In: III HD, Singh A (eds) Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pp 3597\u20133606. PMLR"},{"key":"20016_CR48","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12874-020-00977-1","volume":"20","author":"A Goncalves","year":"2020","unstructured":"Goncalves A, Ray P, Soper B, Stevens J, Coyle L, Sales AP (2020) Generation and evaluation of synthetic patient data. BMC Med Res Methodol 20:1\u201340","journal-title":"BMC Med Res Methodol"},{"key":"20016_CR49","unstructured":"Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks"},{"key":"20016_CR50","unstructured":"Grathwohl W, Chen RT, Bettencourt J, Sutskever I, Duvenaud D (2018) Ffjord: Free-form continuous dynamics for scalable reversible generative models. arXiv preprint arXiv:1810.01367"},{"key":"20016_CR51","volume-title":"Advances in Neural Information Processing Systems","author":"M Heusel","year":"2017","unstructured":"Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) Gans trained by a two time-scale update rule converge to a local nash equilibrium. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in Neural Information Processing Systems, vol 30. 
Curran Associates Inc"},{"key":"20016_CR52","doi-asserted-by":"crossref","unstructured":"Hochreiter S, Schmidhuber J (1996) Lstm can solve hard long time lag problems. Adv Neural Inform Process Syst 9","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"20016_CR53","doi-asserted-by":"crossref","unstructured":"Holmes W, Bialik M, Fadel C (2023) Artificial intelligence in education. Globethics Publications","DOI":"10.58863\/20.500.12424\/4276068"},{"key":"20016_CR54","doi-asserted-by":"crossref","unstructured":"Hong F-T, Shen L, Xu D (2023) Dagan++: Depth-aware generative adversarial network for talking head video generation","DOI":"10.1109\/CVPR52688.2022.00339"},{"key":"20016_CR55","doi-asserted-by":"crossref","unstructured":"Hong F-T, Zhang L, Shen L, Xu D (2022) Depth-aware generative adversarial network for talking head video generation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp 3397\u20133406","DOI":"10.1109\/CVPR52688.2022.00339"},{"issue":"2","key":"20016_CR56","doi-asserted-by":"crossref","first-page":"108","DOI":"10.3390\/info11020108","volume":"11","author":"J Howard","year":"2020","unstructured":"Howard J, Gugger S (2020) Fastai: a layered api for deep learning. Information 11(2):108","journal-title":"Information"},{"key":"20016_CR57","doi-asserted-by":"crossref","unstructured":"Hoyez H, Schockaert C, Rambach J, Mirbach B, Stricker D (2022) Unsupervised image-to-image translation: A review. Sensors 22(21)","DOI":"10.3390\/s22218540"},{"key":"20016_CR58","doi-asserted-by":"crossref","unstructured":"Huang G-B, Zhu Q-Y, Siew C-K (2004) Extreme learning machine: a new learning scheme of feedforward neural networks. In: 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541), vol 2, pp 985\u2013990. 
IEEE","DOI":"10.1109\/IJCNN.2004.1380068"},{"key":"20016_CR59","doi-asserted-by":"crossref","unstructured":"Isola P, Zhu J-Y, Zhou T, Efros AA (2018) Image-to-image translation with conditional adversarial networks","DOI":"10.1109\/CVPR.2017.632"},{"key":"20016_CR60","first-page":"267","volume":"309","author":"V Jain","year":"2023","unstructured":"Jain V, Sengar SS, Ronickom JFA (2023) Age-specific diagnostic classification of asd using deep learning approaches. Stud Health Technol Inform 309:267\u2013271","journal-title":"Stud Health Technol Inform"},{"issue":"9","key":"20016_CR61","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1038\/s42256-019-0088-2","volume":"1","author":"A Jobin","year":"2019","unstructured":"Jobin A, Ienca M, Vayena E (2019) The global landscape of ai ethics guidelines. Nature Mach Intell 1(9):389\u2013399","journal-title":"Nature Mach Intell"},{"issue":"1","key":"20016_CR62","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.35","volume":"3","author":"AE Johnson","year":"2016","unstructured":"Johnson AE, Pollard TJ, Shen L, Lehman L-WH, Feng M, Ghassemi M, Moody B, Szolovits P, Anthony Celi L, Mark RG (2016) Mimic-iii, a freely accessible critical care database. Scientific data 3(1):1\u20139","journal-title":"Scientific data"},{"key":"20016_CR63","doi-asserted-by":"crossref","unstructured":"Joshi V, Peters M, Hopkins M (2018) Extending a parser to distant domains using a few dozen partially annotated examples","DOI":"10.18653\/v1\/P18-1110"},{"issue":"7873","key":"20016_CR64","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","volume":"596","author":"J Jumper","year":"2021","unstructured":"Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, \u017d\u00eddek A, Potapenko A et al (2021) Highly accurate protein structure prediction with alphafold. 
Nature 596(7873):583\u2013589","journal-title":"Nature"},{"issue":"1","key":"20016_CR65","first-page":"1","volume":"19","author":"AS Kale","year":"2023","unstructured":"Kale AS, Pandya V, Di Troia F, Stamp M (2023) Malware classification with word2vec, hmm2vec, bert, and elmo. J Comput Virol Hacking Tech 19(1):1\u201316","journal-title":"J Comput Virol Hacking Tech"},{"key":"20016_CR66","doi-asserted-by":"crossref","unstructured":"Kang M, Zhu J-Y, Zhang R, Park J, Shechtman E, Paris S, Park T (2023) Scaling up gans for text-to-image synthesis. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp 10124\u201310134","DOI":"10.1109\/CVPR52729.2023.00976"},{"key":"20016_CR67","doi-asserted-by":"crossref","unstructured":"Karras T, Laine S, Aittala M, Hellsten J, Lehtinen J, Aila T (2020) Analyzing and improving the image quality of stylegan. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 8110\u20138119","DOI":"10.1109\/CVPR42600.2020.00813"},{"key":"20016_CR68","unstructured":"Kay W, Carreira J, Simonyan K, Zhang B, Hillier C, Vijayanarasimhan S, Viola F, Green T, Back T, Natsev P, Suleyman M, Zisserman A (2017) The kinetics human action video dataset"},{"issue":"4","key":"20016_CR69","doi-asserted-by":"crossref","first-page":"5179","DOI":"10.1007\/s11042-021-11881-1","volume":"81","author":"G Keerti","year":"2022","unstructured":"Keerti G, Vaishnavi A, Mukherjee P, Vidya AS, Sreenithya GS, Nayab D (2022) Attentional networks for music generation. 
Multimed Tools Appl 81(4):5179\u20135189","journal-title":"Multimed Tools Appl"},{"key":"20016_CR70","unstructured":"Keskar NS, McCann B, Varshney LR, Xiong C, Socher R (2019) Ctrl: A conditional transformer language model for controllable generation"},{"key":"20016_CR71","doi-asserted-by":"crossref","first-page":"30399","DOI":"10.1007\/s11042-020-09607-w","volume":"80","author":"A Khamparia","year":"2021","unstructured":"Khamparia A, Gupta D, Rodrigues JJ, de Albuquerque VHC (2021) Dcavn: Cervical cancer prediction and classification using deep convolutional and variational autoencoder network. Multimed Tools Appl 80:30399\u201330415","journal-title":"Multimed Tools Appl"},{"key":"20016_CR72","unstructured":"Kingma DP, Dhariwal P (2018) Glow: Generative flow with invertible 1x1 convolutions. Adv Neural Inform Process Syst 31"},{"key":"20016_CR73","volume-title":"Adv Neural Inform Process Syst","author":"DP Kingma","year":"2016","unstructured":"Kingma DP, Salimans T, Jozefowicz R, Chen X, Sutskever I, Welling M (2016) Improved variational inference with inverse autoregressive flow. In: Lee D, Sugiyama M, Luxburg U, Guyon I, Garnett R (eds) Adv Neural Inform Process Syst, vol 29. Curran Associates Inc"},{"key":"20016_CR74","unstructured":"Kingma DP, Welling M (2013a) Auto-encoding variational bayes"},{"key":"20016_CR75","unstructured":"Kingma DP, Welling M (2013b) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114"},{"key":"20016_CR76","unstructured":"Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks"},{"issue":"21","key":"20016_CR77","doi-asserted-by":"crossref","first-page":"32057","DOI":"10.1007\/s11042-023-14457-3","volume":"82","author":"S Kollem","year":"2023","unstructured":"Kollem S, Reddy KR, Rao DS (2023) A novel diffusivity function-based image denoising for mri medical images. 
Multimed Tools Appl 82(21):32057\u201332089","journal-title":"Multimed Tools Appl"},{"key":"20016_CR78","doi-asserted-by":"crossref","unstructured":"Kondratyuk D, Yuan L, Li Y, Zhang L, Tan M, Brown M, Gong B (2021) Movinets: Mobile video networks for efficient video recognition. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 16020\u201316030","DOI":"10.1109\/CVPR46437.2021.01576"},{"issue":"8","key":"20016_CR79","doi-asserted-by":"crossref","first-page":"5098","DOI":"10.3390\/app13085098","volume":"13","author":"H Ku","year":"2023","unstructured":"Ku H, Lee M (2023) Textcontrolgan: Text-to-image synthesis with controllable generative adversarial networks. Appl Sci 13(8):5098","journal-title":"Appl Sci"},{"issue":"26","key":"20016_CR80","doi-asserted-by":"crossref","first-page":"40585","DOI":"10.1007\/s11042-023-15138-x","volume":"82","author":"L Kumar","year":"2023","unstructured":"Kumar L, Singh DK (2023) A comprehensive survey on generative adversarial networks used for synthesizing multimedia content. Multimed Tools Appl 82(26):40585\u201340624","journal-title":"Multimed Tools Appl"},{"issue":"3","key":"20016_CR81","doi-asserted-by":"crossref","first-page":"3329","DOI":"10.1007\/s11227-022-04767-y","volume":"79","author":"S Kumar","year":"2023","unstructured":"Kumar S, Mallik A, Sengar SS (2023) Community detection in complex networks using stacked autoencoders and crow search algorithm. J Supercomput 79(3):3329\u20133356","journal-title":"J Supercomput"},{"key":"20016_CR82","doi-asserted-by":"crossref","unstructured":"Lakshmi PB, Reddy VD, Ghosh S, Sengar SS (2023) Classification of autism spectrum disorder based on brain image data using deep neural networks. In: International conference on frontiers of intelligent computing: theory and applications, pp 209\u2013218. 
Springer","DOI":"10.1007\/978-981-99-6702-5_17"},{"issue":"2","key":"20016_CR83","doi-asserted-by":"crossref","first-page":"167","DOI":"10.3233\/SW-140134","volume":"6","author":"J Lehmann","year":"2015","unstructured":"Lehmann J, Isele R, Jakob M, Jentzsch A, Kontokostas D, Mendes PN, Hellmann S, Morsey M, Van Kleef P, Auer S et al (2015) Dbpedia-a large-scale, multilingual knowledge base extracted from wikipedia. Semantic web 6(2):167\u2013195","journal-title":"Semantic web"},{"key":"20016_CR84","doi-asserted-by":"crossref","unstructured":"Lewis M, Liu Y, Goyal N, Ghazvininejad M, Mohamed A, Levy O, Stoyanov V, Zettlemoyer L (2019) Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension","DOI":"10.18653\/v1\/2020.acl-main.703"},{"key":"20016_CR85","doi-asserted-by":"crossref","unstructured":"Li Y, Wu C-Y, Fan H, Mangalam K, Xiong B, Malik J, Feichtenhofer C (2022) Mvitv2: Improved multiscale vision transformers for classification and detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp 4804\u20134814","DOI":"10.1109\/CVPR52688.2022.00476"},{"key":"20016_CR86","unstructured":"Lightman H, Kosaraju V, Burda Y, Edwards H, Baker B, Lee T, Leike J, Schulman J, Sutskever I, Cobbe K (2023) Let\u2019s verify step by step"},{"key":"20016_CR87","unstructured":"Lin C-Y (2004) ROUGE: A package for automatic evaluation of summaries. In: Text Summarization Branches Out, Barcelona, Spain. Association for Computational Linguistics, pp 74\u201381"},{"key":"20016_CR88","doi-asserted-by":"crossref","unstructured":"Lin Y, Wang Y, Li Y, Gao Y, Wang Z, Khan L (2021) Attention-based spatial guidance for image-to-image translation. 
In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision (WACV), pp 816\u2013825","DOI":"10.1109\/WACV48630.2021.00086"},{"issue":"12","key":"20016_CR89","doi-asserted-by":"crossref","first-page":"10227","DOI":"10.1109\/TGRS.2020.3042974","volume":"59","author":"Q Liu","year":"2020","unstructured":"Liu Q, Zhou H, Xu Q, Liu X, Wang Y (2020) Psgan: A generative adversarial network for remote sensing image pan-sharpening. IEEE Trans Geosci Remote Sens 59(12):10227\u201310242","journal-title":"IEEE Trans Geosci Remote Sens"},{"key":"20016_CR90","doi-asserted-by":"crossref","unstructured":"Liu W, Zhou P, Zhao Z, Wang Z, Ju Q, Deng H, Wang P (2019) K-bert: Enabling language representation with knowledge graph","DOI":"10.1609\/aaai.v34i03.5681"},{"key":"20016_CR91","doi-asserted-by":"crossref","unstructured":"Liu Z, Luo P, Wang X, Tang X (2015) Deep learning face attributes in the wild. In: Proceedings of international conference on computer vision (ICCV)","DOI":"10.1109\/ICCV.2015.425"},{"key":"20016_CR92","doi-asserted-by":"crossref","unstructured":"Liu Z, Ning J, Cao Y, Wei Y, Zhang Z, Lin S, Hu H (2022) Video swin transformer. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 3202\u20133211","DOI":"10.1109\/CVPR52688.2022.00320"},{"key":"20016_CR93","unstructured":"Luckin R, Holmes W (2016) Intelligence unleashed: An argument for ai in education"},{"key":"20016_CR94","unstructured":"Madadkhani S, Ramos OM, Chapman M, Dunietz J, Ouaknine A, Rolnick D, Bengio Y (2024) Tackling climate change with machine learning: Fostering the maturity of ml applications for climate change. In: ICLR 2024 Workshops"},{"key":"20016_CR95","doi-asserted-by":"crossref","unstructured":"Mao X, Li Q, Xie H, Lau RY, Wang Z, Paul\u00a0Smolley S (2017) Least squares generative adversarial networks. 
In: Proceedings of the IEEE international conference on computer vision, pp 2794\u20132802","DOI":"10.1109\/ICCV.2017.304"},{"key":"20016_CR96","doi-asserted-by":"crossref","unstructured":"Masci J, Meier U, Cire\u015fan D, Schmidhuber J (2011) Stacked convolutional auto-encoders for hierarchical feature extraction. In: Artificial Neural Networks and Machine Learning\u2013ICANN 2011: 21st International Conference on Artificial Neural Networks, Espoo, Finland, June 14-17, 2011, Proceedings, Part I 21, pages 52\u201359. Springer","DOI":"10.1007\/978-3-642-21735-7_7"},{"key":"20016_CR97","unstructured":"McKeown K, Barzilay R, Blair-Goldensohn S, Evans D, Hatzivassiloglou V, Klavans J, Nenkova A, Schiffman B, Sigelman S (2002) The columbia multi-document summarizer for duc 2002. In: Workshop on Automatic Summarization, pp 1\u20138"},{"issue":"6","key":"20016_CR98","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3457607","volume":"54","author":"N Mehrabi","year":"2021","unstructured":"Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A (2021) A survey on bias and fairness in machine learning. ACM Comput Surv (CSUR) 54(6):1\u201335","journal-title":"ACM Comput Surv (CSUR)"},{"issue":"10","key":"20016_CR99","doi-asserted-by":"crossref","first-page":"1993","DOI":"10.1109\/TMI.2014.2377694","volume":"34","author":"BH Menze","year":"2014","unstructured":"Menze BH, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, Burren Y, Porz N, Slotboom J, Wiest R et al (2014) The multimodal brain tumor image segmentation benchmark (brats). 
IEEE Trans Med Imaging 34(10):1993\u20132024","journal-title":"IEEE Trans Med Imaging"},{"key":"20016_CR100","unstructured":"Mescheder L, Geiger A, Nowozin S (2018a) Which training methods for gans do actually converge?"},{"key":"20016_CR101","unstructured":"Mescheder L, Nowozin S, Geiger A (2018b) The numerics of gans"},{"key":"20016_CR102","unstructured":"Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781"},{"key":"20016_CR103","unstructured":"Min D, Song M, Hwang SJ (2022) Styletalker: One-shot style-based audio-driven talking head video generation"},{"key":"20016_CR104","unstructured":"Mirza M, Osindero S (2014) Conditional generative adversarial nets"},{"key":"20016_CR105","doi-asserted-by":"crossref","first-page":"111734","DOI":"10.1016\/j.jss.2023.111734","volume":"203","author":"A Moradi Dakhel","year":"2023","unstructured":"Moradi Dakhel A, Majdinasab V, Nikanjam A, Khomh F, Desmarais MC, Jiang ZMJ (2023) Github copilot ai pair programmer: Asset or liability? J Syst Softw 203:111734","journal-title":"J Syst Softw"},{"key":"20016_CR106","volume-title":"Advances in Neural Information Processing Systems","author":"V Nagarajan","year":"2017","unstructured":"Nagarajan V, Kolter JZ (2017) Gradient descent gan optimization is locally stable. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in Neural Information Processing Systems, vol 30. Curran Associates Inc"},{"key":"20016_CR107","doi-asserted-by":"crossref","unstructured":"Nagrani A, Chung JS, Zisserman A (2017) VoxCeleb: A large-scale speaker identification dataset. In: Interspeech 2017. 
ISCA","DOI":"10.21437\/Interspeech.2017-950"},{"key":"20016_CR108","unstructured":"Nakano R, Hilton J, Balaji S, Wu J, Ouyang L, Kim C, Hesse C, Jain S, Kosaraju V, Saunders W, Jiang X, Cobbe K, Eloundou T, Krueger G, Button K, Knight M, Chess B, Schulman J (2022) Webgpt: Browser-assisted question-answering with human feedback"},{"key":"20016_CR109","doi-asserted-by":"crossref","unstructured":"Neimark D, Bar O, Zohar M, Asselmann D (2021) Video transformer network","DOI":"10.1109\/ICCVW54120.2021.00355"},{"key":"20016_CR110","unstructured":"Odena A (2016) Semi-supervised learning with generative adversarial networks"},{"key":"20016_CR111","unstructured":"Odena A, Olah C, Shlens J (2017) Conditional image synthesis with auxiliary classifier gans"},{"key":"20016_CR112","unstructured":"OpenAI (2023) Gpt-4 technical report"},{"key":"20016_CR113","doi-asserted-by":"crossref","unstructured":"Panayotov V, Chen G, Povey D, Khudanpur S (2015) Librispeech: An asr corpus based on public domain audio books. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 5206\u20135210","DOI":"10.1109\/ICASSP.2015.7178964"},{"key":"20016_CR114","doi-asserted-by":"crossref","unstructured":"Paola ZL, Jes\u00fas LS, Christian AH, Sonia RU (2023) Correction of banding errors in satellite images with generative adversarial networks (gan). IEEE Access","DOI":"10.1109\/ACCESS.2023.3279265"},{"key":"20016_CR115","doi-asserted-by":"crossref","unstructured":"Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations","DOI":"10.18653\/v1\/N18-1202"},{"key":"20016_CR116","doi-asserted-by":"crossref","unstructured":"Prajwal K, Mukhopadhyay R, Namboodiri VP, Jawahar C (2020) A lip sync expert is all you need for speech to lip generation in the wild. 
In: Proceedings of the 28th ACM international conference on multimedia, pp 484\u2013492","DOI":"10.1145\/3394171.3413532"},{"key":"20016_CR117","unstructured":"Pudari R, Ernst NA (2023) From copilot to pilot: Towards ai supported software development"},{"key":"20016_CR118","doi-asserted-by":"crossref","unstructured":"Qi G-J (2018) Loss-sensitive generative adversarial networks on lipschitz densities","DOI":"10.1007\/s11263-019-01265-2"},{"issue":"10","key":"20016_CR119","doi-asserted-by":"crossref","first-page":"1872","DOI":"10.1007\/s11431-020-1647-3","volume":"63","author":"X Qiu","year":"2020","unstructured":"Qiu X, Sun T, Xu Y, Shao Y, Dai N, Huang X (2020) Pre-trained models for natural language processing: A survey. SCIENCE CHINA Technol Sci 63(10):1872\u20131897","journal-title":"SCIENCE CHINA Technol Sci"},{"key":"20016_CR120","first-page":"81","volume":"1","author":"JR Quinlan","year":"1986","unstructured":"Quinlan JR (1986) Induction of decision trees. Machine learning 1:81\u2013106","journal-title":"Machine learning"},{"key":"20016_CR121","unstructured":"Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G, Sutskever I (2021) Learning transferable visual models from natural language supervision"},{"key":"20016_CR122","unstructured":"Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434"},{"key":"20016_CR123","unstructured":"Radford A, Narasimhan K, Salimans T, Sutskever I et\u00a0al (2018) Improving language understanding by generative pre-training"},{"issue":"8","key":"20016_CR124","first-page":"9","volume":"1","author":"A Radford","year":"2019","unstructured":"Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I et al (2019) Language models are unsupervised multitask learners. 
OpenAI blog 1(8):9","journal-title":"OpenAI blog"},{"issue":"1","key":"20016_CR125","first-page":"5485","volume":"21","author":"C Raffel","year":"2020","unstructured":"Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu PJ (2020) Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 21(1):5485\u20135551","journal-title":"J Mach Learn Res"},{"key":"20016_CR126","doi-asserted-by":"crossref","first-page":"3275","DOI":"10.1007\/s11042-020-09549-3","volume":"80","author":"R Rani","year":"2021","unstructured":"Rani R, Lobiyal D (2021) An extractive text summarization approach using tagged-lda based topic modeling. Multimed Tools Appl 80:3275\u20133305","journal-title":"Multimed Tools Appl"},{"issue":"14","key":"20016_CR127","first-page":"71","volume":"8","author":"MDM Reddy","year":"2021","unstructured":"Reddy MDM, Basha MSM, Hari MMC, Penchalaiah MN (2021) Dall-e: Creating images from text. UGC Care Group I Journal 8(14):71\u201375","journal-title":"UGC Care Group I Journal"},{"key":"20016_CR128","doi-asserted-by":"crossref","unstructured":"Rezagholiradeh M, Haidar MA (2018) Reg-gan: Semi-supervised learning based on generative adversarial networks for regression. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 2806\u20132810. IEEE","DOI":"10.1109\/ICASSP.2018.8462534"},{"key":"20016_CR129","volume-title":"Advances in Neural Information Processing Systems","author":"T Salimans","year":"2016","unstructured":"Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X, Chen X (2016) Improved techniques for training gans. In: Lee D, Sugiyama M, Luxburg U, Guyon I, Garnett R (eds) Advances in Neural Information Processing Systems, vol 29. Curran Associates Inc"},{"key":"20016_CR130","unstructured":"Sang EF, De\u00a0Meulder F (2003) Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs\/0306050"},{"key":"20016_CR131","doi-asserted-by":"crossref","unstructured":"Sengar SS, Kumar S (2022) Content-based secure image retrieval in an untrusted third-party environment. In: International conference on frontiers of intelligent computing: theory and applications, pp 287\u2013297. Springer","DOI":"10.1007\/978-981-19-7513-4_26"},{"issue":"3","key":"20016_CR132","doi-asserted-by":"crossref","first-page":"985","DOI":"10.1002\/ima.22836","volume":"33","author":"SS Sengar","year":"2023","unstructured":"Sengar SS, Meulengracht C, Boesen MP, Overgaard AF, Gudbergsen H, Nybing JD, Perslev M, Dam EB (2023) Multi-planar 3d knee mri segmentation via unet inspired architectures. Int J Imaging Syst Technol 33(3):985\u2013998","journal-title":"Int J Imaging Syst Technol"},{"key":"20016_CR133","doi-asserted-by":"crossref","unstructured":"Sengar SS, Mukhopadhyay S (2016) Moving object tracking using laplacian-dct based perceptual hash. In: 2016 International conference on wireless communications, signal processing and networking (WiSPNET), pp 2345\u20132349. IEEE","DOI":"10.1109\/WiSPNET.2016.7566561"},{"issue":"15","key":"20016_CR134","doi-asserted-by":"crossref","first-page":"11443","DOI":"10.1007\/s00521-019-04635-6","volume":"32","author":"SS Sengar","year":"2020","unstructured":"Sengar SS, Mukhopadhyay S (2020) Motion segmentation-based surveillance video compression using adaptive particle swarm optimization. Neural Comput Appl 32(15):11443\u201311457","journal-title":"Neural Comput Appl"},{"key":"20016_CR135","doi-asserted-by":"crossref","first-page":"8355","DOI":"10.1007\/s11042-020-09885-4","volume":"80","author":"X Shi","year":"2021","unstructured":"Shi X, Lv F, Seng D, Zhang J, Chen J, Xing B (2021) Visualizing and understanding graph convolutional network. 
Multimed Tools Appl 80:8355\u20138375","journal-title":"Multimed Tools Appl"},{"key":"20016_CR136","unstructured":"Singhal A (2012) Introducing the knowledge graph: Things, not strings"},{"key":"20016_CR137","unstructured":"Soomro K, Zamir AR, Shah M (2012) Ucf101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402"},{"key":"20016_CR138","unstructured":"Steiner T, Verborgh R, Troncy R, Gabarro J, Van\u00a0de Walle R (2012) Adding realtime coverage to the google knowledge graph. In: 11th International Semantic Web Conference (ISWC 2012), vol 914, pp 65\u201368. Citeseer"},{"key":"20016_CR139","volume-title":"Advances in Neural Information Processing Systems","author":"I Sutskever","year":"2014","unstructured":"Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger K (eds) Advances in Neural Information Processing Systems, vol 27. Curran Associates Inc"},{"key":"20016_CR140","doi-asserted-by":"crossref","unstructured":"Tahir R, Cheng K, Memon BA, Liu Q (2022) A diverse domain generative adversarial network for style transfer on face photographs","DOI":"10.9781\/ijimai.2022.08.001"},{"key":"20016_CR141","doi-asserted-by":"crossref","unstructured":"Tan S, Wong K, Wang S, Manivasagam S, Ren M, Urtasun R (2021) Scenegen: Learning to generate realistic traffic scenes. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR 2021), pp 892\u2013901. IEEE Computer Society","DOI":"10.1109\/CVPR46437.2021.00095"},{"issue":"5","key":"20016_CR142","doi-asserted-by":"crossref","first-page":"874","DOI":"10.1016\/j.jvcir.2014.01.008","volume":"25","author":"A Tanchenko","year":"2014","unstructured":"Tanchenko A (2014) Visual-psnr measure of image quality. J Vis Commun Image Represent 25(5):874\u2013878","journal-title":"J Vis Commun Image Represent"},{"key":"20016_CR143","doi-asserted-by":"crossref","unstructured":"Tibrewala R, Dutt T, Tong A, Ginocchio L, Keerthivasan MB, Baete SH, Chopra S, Lui YW, Sodickson DK, Chandarana H, Johnson PM (2023) Fastmri prostate: A publicly available, biparametric mri dataset to advance machine learning for prostate cancer imaging","DOI":"10.1038\/s41597-024-03252-w"},{"issue":"1","key":"20016_CR144","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/s40561-023-00237-x","volume":"10","author":"A Tlili","year":"2023","unstructured":"Tlili A, Shehata B, Adarkwah MA, Bozkurt A, Hickey DT, Huang R, Agyemang B (2023) What if the devil is my guardian angel: Chatgpt as a case study of using chatbots in education. Smart Learning Environments 10(1):15","journal-title":"Smart Learning Environments"},{"issue":"1","key":"20016_CR145","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1038\/s41591-018-0300-7","volume":"25","author":"EJ Topol","year":"2019","unstructured":"Topol EJ (2019) High-performance medicine: the convergence of human and artificial intelligence. Nat Med 25(1):44\u201356","journal-title":"Nat Med"},{"key":"20016_CR146","doi-asserted-by":"crossref","unstructured":"Torbunov D, Huang Y, Yu H, Huang J, Yoo S, Lin M, Viren B, Ren Y (2023) Uvcgan: Unet vision transformer cycle-consistent gan for unpaired image-to-image translation. 
In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision, pp 702\u2013712","DOI":"10.1109\/WACV56688.2023.00077"},{"key":"20016_CR147","unstructured":"van\u00a0den Oord A, Dieleman S, Zen H, Simonyan K, Vinyals O, Graves A, Kalchbrenner N, Senior A, Kavukcuoglu K (2016) Wavenet: A generative model for raw audio"},{"key":"20016_CR148","doi-asserted-by":"crossref","unstructured":"Vasanthi P, Mohan L (2023) Multi-head-self-attention based yolov5x-transformer for multi-scale object detection. Multimed Tools Appl pp 1\u201327","DOI":"10.1007\/s11042-023-15773-4"},{"key":"20016_CR149","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in Neural Information Processing Systems, vol 30. Curran Associates Inc"},{"key":"20016_CR150","unstructured":"Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol P-A, Bottou L (2010) Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11(12)"},{"key":"20016_CR151","doi-asserted-by":"crossref","unstructured":"Wang P, Zhang C, Qi F, Liu S, Zhang X, Lyu P, Han J, Liu J, Ding E, Shi G (2021) Pgnet: Real-time arbitrarily-shaped text spotting with point gathering network. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 35, pp 2782\u20132790","DOI":"10.1609\/aaai.v35i4.16383"},{"key":"20016_CR152","doi-asserted-by":"crossref","unstructured":"Wang Q, Gao J, Lin W, Yuan Y (2019) Learning from synthetic data for crowd counting in the wild","DOI":"10.1109\/CVPR.2019.00839"},{"key":"20016_CR153","doi-asserted-by":"crossref","unstructured":"Wang S, Li L, Ding Y, Fan C, Yu X (2021b) Audio2head: Audio-driven one-shot talking-head generation with natural head motion. 
arXiv preprint arXiv:2107.09293","DOI":"10.24963\/ijcai.2021\/152"},{"key":"20016_CR154","doi-asserted-by":"crossref","unstructured":"Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM (2017) Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2097\u20132106","DOI":"10.1109\/CVPR.2017.369"},{"issue":"4","key":"20016_CR155","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","volume":"13","author":"Z Wang","year":"2004","unstructured":"Wang Z, Bovik A, Sheikh H, Simoncelli E (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600\u2013612","journal-title":"IEEE Trans Image Process"},{"issue":"10","key":"20016_CR156","doi-asserted-by":"crossref","first-page":"2547","DOI":"10.3390\/rs15102547","volume":"15","author":"J Wei","year":"2023","unstructured":"Wei J, Zou H, Sun L, Cao X, He S, Liu S, Zhang Y (2023) Cfrwd-gan for sar-to-optical image translation. Remote Sens 15(10):2547","journal-title":"Remote Sens"},{"key":"20016_CR157","doi-asserted-by":"crossref","unstructured":"Wu W, Zhang Y, Li C, Qian C, Loy CC (2018) Reenactgan: Learning to reenact faces via boundary transfer","DOI":"10.1007\/978-3-030-01246-5_37"},{"key":"20016_CR158","doi-asserted-by":"crossref","unstructured":"Xiao S, Duan L, Xie G, Li R, Chen Z, Deng G, Nummenmaa J (2021) Hmnet: Hybrid matching network for few-shot link prediction. In: International conference on database systems for advanced applications, pp 307\u2013322. 
Springer","DOI":"10.1007\/978-3-030-73194-6_21"},{"issue":"3","key":"20016_CR159","doi-asserted-by":"crossref","first-page":"547","DOI":"10.3390\/jpm13030547","volume":"13","author":"IR Xu","year":"2023","unstructured":"Xu IR, Van Booven DJ, Goberdhan S, Breto A, Porto J, Alhusseini M, Algohary A, Stoyanova R, Punnen S, Mahne A et al (2023) Generative adversarial networks can create high quality artificial prostate cancer magnetic resonance images. J Personalized Med 13(3):547","journal-title":"J Personalized Med"},{"key":"20016_CR160","doi-asserted-by":"crossref","unstructured":"Xu Y, Li M, Cui L, Huang S, Wei F, Zhou M (2020) Layoutlm: Pre-training of text and layout for document image understanding. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, pp 1192\u20131200","DOI":"10.1145\/3394486.3403172"},{"key":"20016_CR161","volume":"12","author":"S Yan","year":"2022","unstructured":"Yan S, Wang C, Chen W, Lyu J (2022) Swin transformer-based gan for multi-modal medical image translation. Front Oncol 12:942511","journal-title":"Front Oncol"},{"key":"20016_CR162","unstructured":"Yang K, Yau J, Fei-Fei L, Deng J, Russakovsky O (2022) A study of face obfuscation in imagenet. 
In: International conference on machine learning (ICML)"},{"key":"20016_CR163","unstructured":"Yang X, Li Y, Zhang X, Chen H, Cheng W (2023) Exploring the limits of chatgpt for query or aspect-based text summarization"},{"key":"20016_CR164","unstructured":"Yeh R, Liu Z, Goldman DB, Agarwala A (2016) Semantic facial expression editing using autoencoded flow"},{"key":"20016_CR165","doi-asserted-by":"crossref","unstructured":"Yu L, Zhang W, Wang J, Yu Y (2017) Seqgan: Sequence generative adversarial nets with policy gradient","DOI":"10.1609\/aaai.v31i1.10804"},{"key":"20016_CR166","doi-asserted-by":"crossref","unstructured":"Zeng X, Wang F, Luo Y, Kang S-g, Tang J, Lightstone FC, Fang EF, Cornell W, Nussinov R, Cheng F (2022) Deep generative molecular design reshapes drug discovery. Cell Reports Medicine","DOI":"10.1016\/j.xcrm.2022.100794"},{"key":"20016_CR167","unstructured":"Zhang H, Goodfellow I, Metaxas D, Odena A (2019a) Self-attention generative adversarial networks. In: International conference on machine learning, pp 7354\u20137363. PMLR"},{"key":"20016_CR168","doi-asserted-by":"crossref","unstructured":"Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 586\u2013595","DOI":"10.1109\/CVPR.2018.00068"},{"key":"20016_CR169","doi-asserted-by":"crossref","unstructured":"Zhang Z, Han X, Liu Z, Jiang X, Sun M, Liu Q (2019b) Ernie: Enhanced language representation with informative entities. arXiv preprint arXiv:1905.07129","DOI":"10.18653\/v1\/P19-1139"},{"key":"20016_CR170","doi-asserted-by":"crossref","unstructured":"Zhang Z, Li L, Ding Y, Fan C (2021) Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 3661\u20133670","DOI":"10.1109\/CVPR46437.2021.00366"},{"key":"20016_CR171","first-page":"1","volume":"19","author":"Y Zhao","year":"2022","unstructured":"Zhao Y, Celik T, Liu N, Li H-C (2022) A comparative analysis of gan-based methods for sar-to-optical image translation. IEEE Geosci Remote Sens Lett 19:1\u20135","journal-title":"IEEE Geosci Remote Sens Lett"},{"key":"20016_CR172","doi-asserted-by":"crossref","unstructured":"Zhong M, Yin D, Yu T, Zaidi A, Mutuma M, Jha R, Awadallah AH, Celikyilmaz A, Liu Y, Qiu X, Radev D (2021) QMSum: A new benchmark for query-based multi-domain meeting summarization. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 5905\u20135921, Online. Association for Computational Linguistics","DOI":"10.18653\/v1\/2021.naacl-main.472"},{"issue":"6","key":"20016_CR173","first-page":"1","volume":"39","author":"Y Zhou","year":"2020","unstructured":"Zhou Y, Han X, Shechtman E, Echevarria J, Kalogerakis E, Li D (2020) MakeItTalk: speaker-aware talking-head animation. ACM Transactions On Graphics (TOG) 39(6):1\u201315","journal-title":"ACM Transactions On Graphics (TOG)"},{"key":"20016_CR174","doi-asserted-by":"crossref","unstructured":"Zhu C, Xu R, Zeng M, Huang X (2020) A hierarchical network for abstractive meeting summarization with cross-domain pretraining. In: Findings of the Association for Computational Linguistics: EMNLP 2020, pp 194\u2013203, Online. Association for Computational Linguistics","DOI":"10.18653\/v1\/2020.findings-emnlp.19"},{"key":"20016_CR175","doi-asserted-by":"crossref","unstructured":"Zhu J-Y, Park T, Isola P, Efros AA (2017a) Unpaired image-to-image translation using cycle-consistent adversarial networks. 
In: 2017 IEEE international conference on computer vision (ICCV), pp 2242\u20132251","DOI":"10.1109\/ICCV.2017.244"},{"key":"20016_CR176","unstructured":"Zhu J-Y, Zhang R, Pathak D, Darrell T, Efros AA, Wang O, Shechtman E (2017b) Toward multimodal image-to-image translation. Adv Neural Inform Process Syst 30"},{"key":"20016_CR177","volume-title":"Advances in Neural Information Processing Systems","author":"J-Y Zhu","year":"2017","unstructured":"Zhu J-Y, Zhang R, Pathak D, Darrell T, Efros AA, Wang O, Shechtman E (2017) Toward multimodal image-to-image translation. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in Neural Information Processing Systems, vol 30. Curran Associates Inc"},{"key":"20016_CR178","doi-asserted-by":"crossref","unstructured":"Zuo Z, Zhao L, Lian S, Chen H, Wang Z, Li A, Xing W, Lu D (2022) Style fader generative adversarial networks for style degree controllable artistic style transfer. In: Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), pp 5002\u20135009","DOI":"10.24963\/ijcai.2022\/693"}],"container-title":["Multimedia Tools and 
Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-024-20016-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-024-20016-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-024-20016-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,2]],"date-time":"2025-07-02T13:58:35Z","timestamp":1751464715000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-024-20016-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,14]]},"references-count":178,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2025,6]]}},"alternative-id":["20016"],"URL":"https:\/\/doi.org\/10.1007\/s11042-024-20016-1","relation":{},"ISSN":["1573-7721"],"issn-type":[{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,14]]},"assertion":[{"value":"9 March 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 June 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 August 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 August 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not Required.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}},{"value":"All authors declare that they have no 
conflict of interest.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}