{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:11:51Z","timestamp":1760238711779,"version":"build-2065373602"},"reference-count":60,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2020,8,25]],"date-time":"2020-08-25T00:00:00Z","timestamp":1598313600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004343","name":"Stavros Niarchos Foundation","doi-asserted-by":"publisher","award":["HFRI faculty grant no.  1725"],"award-info":[{"award-number":["HFRI faculty grant no.  1725"]}],"id":[{"id":"10.13039\/501100004343","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100013209","name":"Hellenic Foundation for Research and Innovation","doi-asserted-by":"publisher","award":["HFRI faculty grant no.  1725"],"award-info":[{"award-number":["HFRI faculty grant no.  1725"]}],"id":[{"id":"10.13039\/501100013209","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003448","name":"General Secretariat for Research and Technology","doi-asserted-by":"publisher","award":["HFRI faculty grant no.  1725"],"award-info":[{"award-number":["HFRI faculty grant no.  1725"]}],"id":[{"id":"10.13039\/501100003448","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>In spatio-temporal predictive coding problems, like next-frame prediction in video, determining the content of plausible future frames is primarily based on the image dynamics of previous frames. We establish an alternative approach based on their underlying semantic information when considering data that do not necessarily incorporate a temporal aspect, but instead they comply with some form of associative ordering. 
In this work, we introduce the notion of semantic predictive coding by proposing a novel generative adversarial modeling framework which incorporates the arbiter classifier as a new component. While the generator is primarily tasked with the anticipation of possible next frames, the arbiter\u2019s principal role is the assessment of their credibility. Taking into account that the denotative meaning of each forthcoming element can be encapsulated in a generic label descriptive of its content, a classification loss is introduced along with the adversarial loss. As supported by our experimental findings in a next-digit and a next-letter scenario, the utilization of the arbiter not only results in an enhanced GAN performance, but it also broadens the network\u2019s creative capabilities in terms of the diversity of the generated symbols.<\/jats:p>","DOI":"10.3390\/make2030017","type":"journal-article","created":{"date-parts":[[2020,8,25]],"date-time":"2020-08-25T09:30:07Z","timestamp":1598347807000},"page":"307-326","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Semantic Predictive Coding with Arbitrated Generative Adversarial Networks"],"prefix":"10.3390","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9481-2485","authenticated-orcid":false,"given":"Radamanthys","family":"Stivaktakis","sequence":"first","affiliation":[{"name":"Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), 70013 Crete, Greece"},{"name":"Computer Science Department, University of Crete, 70013 Crete, Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6498-9450","authenticated-orcid":false,"given":"Grigorios","family":"Tsagkatakis","sequence":"additional","affiliation":[{"name":"Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), 70013 Crete, 
Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4918-603X","authenticated-orcid":false,"given":"Panagiotis","family":"Tsakalides","sequence":"additional","affiliation":[{"name":"Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), 70013 Crete, Greece"},{"name":"Computer Science Department, University of Crete, 70013 Crete, Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,8,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_3","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Rumelhart, D.E., Hinton, G.E., and Williams, R.J. (1985). Learning Internal Representations by Error Propagation, California University San Diego, La Jolla Institute for Cognitive Science. 
Technical report.","DOI":"10.21236\/ADA164453"},{"key":"ref_6","first-page":"427","article-title":"Predictive coding: A fresh view of inhibition in the retina","volume":"216","author":"Srinivasan","year":"1982","journal-title":"Proc. R. Soc. Lond. Ser. B Biol. Sci."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1038\/306021a0","article-title":"Parallel visual computation","volume":"306","author":"Ballard","year":"1983","journal-title":"Nature"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1038\/4580","article-title":"Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects","volume":"2","author":"Rao","year":"1999","journal-title":"Nat. Neurosci."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1211","DOI":"10.1098\/rstb.2008.0300","article-title":"Predictive coding under the free-energy principle","volume":"364","author":"Friston","year":"2009","journal-title":"Philos. Trans. R. Soc. B Biol. Sci."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"695","DOI":"10.1016\/j.neuron.2012.10.038","article-title":"Canonical microcircuits for predictive coding","volume":"76","author":"Bastos","year":"2012","journal-title":"Neuron"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1019","DOI":"10.1038\/s41593-018-0200-7","article-title":"Does predictive coding have a future?","volume":"21","author":"Friston","year":"2018","journal-title":"Nat. Neurosci."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"69273","DOI":"10.1109\/ACCESS.2020.2987281","article-title":"Deep Learning in Next-Frame Prediction: A Benchmark Review","volume":"8","author":"Zhou","year":"2020","journal-title":"IEEE Access"},{"key":"ref_13","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014, January 8\u201313). Generative adversarial nets. 
Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_14","unstructured":"Vondrick, C., Pirsiavash, H., and Torralba, A. (2016, January 5\u201310). Generating videos with scene dynamics. Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tulyakov, S., Liu, M.Y., Yang, X., and Kautz, J. (2018, January 18\u201323). Mocogan: Decomposing motion and content for video generation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00165"},{"key":"ref_16","unstructured":"Wang, Y., Jiang, L., Yang, M.H., Li, L.J., Long, M., and Fei-Fei, L. (2019, January 6\u20139). Eidetic 3D lstm: A model for video prediction and beyond. Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Saito, M., Matsumoto, E., and Saito, S. (2017, January 22\u201327). Temporal generative adversarial nets with singular value clipping. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.308"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_19","unstructured":"Michalski, V., Memisevic, R., and Konda, K. (2014, January 8\u201313). Modeling deep temporal dependencies with recurrent grammar cells. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1829","DOI":"10.1109\/TPAMI.2013.53","article-title":"Learning to relate images","volume":"35","author":"Memisevic","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_21","unstructured":"Srivastava, N., Mansimov, E., and Salakhutdinov, R. (2015, January 6\u201311). Unsupervised learning of video representations using lstms. Proceedings of the 32nd International Conference on Machine Learning, Lille, France."},{"key":"ref_22","unstructured":"Xingjian, S., Chen, Z., Wang, H., Yeung, D.Y., Wong, W.K., and Woo, W.c. (2015, January 7\u201312). Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_23","unstructured":"Lotter, W., Kreiman, G., and Cox, D. (2017, January 24\u201326). Deep predictive coding networks for video prediction and unsupervised learning. Proceedings of the International Conference on Learning Representations, Toulon, France."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Rane, R.P., Sz\u00fcgyi, E., Saxena, V., Ofner, A., and Stober, S. (2020, January 8\u201311). PredNet and Predictive Coding: A Critical Review. Proceedings of the 2020 International Conference on Multimedia Retrieval, Dublin, Ireland.","DOI":"10.1145\/3372278.3390694"},{"key":"ref_25","unstructured":"Villegas, R., Yang, J., Hong, S., Lin, X., and Lee, H. (2017, January 24\u201326). Decomposing motion and content for natural video sequence prediction. Proceedings of the International Conference on Learning Representations, Toulon, France."},{"key":"ref_26","unstructured":"Wang, Y., Long, M., Wang, J., Gao, Z., and Yu, P.S. (2017, January 4\u20139). Predrnn: Recurrent neural networks for predictive learning using spatiotemporal lstms. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_27","unstructured":"Wang, Y., Gao, Z., Long, M., Wang, J., and Yu, P.S. (2018, January 10\u201315). Predrnn++: Towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning. 
Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_28","unstructured":"Mathieu, M., Couprie, C., and LeCun, Y. (2016, January 2\u20134). Deep multi-scale video prediction beyond mean square error. Proceedings of the International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico."},{"key":"ref_29","unstructured":"Radford, A., Metz, L., and Chintala, S. (2016, January 2\u20134). Unsupervised representation learning with deep convolutional generative adversarial networks. Proceedings of the International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico."},{"key":"ref_30","unstructured":"Lotter, W., Kreiman, G., and Cox, D. (2016, January 2\u20134). Unsupervised learning of visual structure using predictive generative networks. Proceedings of the International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhou, Y., and Berg, T.L. (2016, January 11\u201314). Learning temporal transformations from time-lapse videos. Proceedings of the European Conference on Computer Vision, ECCV 2016, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46484-8_16"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Liang, X., Lee, L., Dai, W., and Xing, E.P. (2017, January 22\u201329). Dual motion GAN for future-flow embedded video prediction. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.194"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Lu, C., Hirsch, M., and Scholkopf, B. (2017, January 21\u201326). Flexible spatio-temporal networks for video prediction. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.230"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Vondrick, C., and Torralba, A. (2017, January 21\u201326). 
Generating the future with adversarial transformers. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.319"},{"key":"ref_35","unstructured":"Bhattacharjee, P., and Das, S. (2017, January 4\u20139). Temporal coherency based criteria for predicting video frames using deep multi-stage generative adversarial networks. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_36","unstructured":"Wichers, N., Villegas, R., Erhan, D., and Lee, H. (2018, January 10\u201315). Hierarchical long-term video prediction without supervision. Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholm, Sweden."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Kwon, Y.H., and Park, M.G. (2019, January 15\u201321). Predicting future frames using retrospective cycle gan. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00191"},{"key":"ref_38","first-page":"3","article-title":"FUTUREGAN: Anticipating the future frames of video sequences using spatio-temporal 3D convolutions in progressively growing gans","volume":"XLII-2\/W16","author":"Aigner","year":"2019","journal-title":"ISPRS Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Ledig, C., Theis, L., Husz\u00e1r, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., and Wang, Z. (2017, January 21\u201326). Photo-realistic single image super-resolution using a generative adversarial network. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.19"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"3312","DOI":"10.1109\/TIP.2019.2895768","article-title":"Generative adversarial networks and perceptual losses for video super-resolution","volume":"28","author":"Lucas","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_41","unstructured":"Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., and Lee, H. (2016, January 20\u201322). Generative adversarial text to image synthesis. Proceedings of the 33rd International Conference on Machine Learning, ICML, New York, NY, USA."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., and Metaxas, D.N. (2017, January 22\u201329). Stackgan: Text to photo-realistic image synthesis with stacked generative adversarial networks. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.629"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Liu, X., Meng, G., Xiang, S., and Pan, C. (2018, January 20\u201324). Semantic image synthesis via conditional cycle-generative adversarial networks. Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), IEEE, Beijing, China.","DOI":"10.1109\/ICPR.2018.8545383"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., and Efros, A.A. (2016, January 27\u201330). Context encoders: Feature learning by inpainting. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.278"},{"key":"ref_45","unstructured":"Denton, E.L., Chintala, S., and Fergus, R. (2015, January 7\u201312). Deep generative image models using a laplacian pyramid of adversarial networks. 
Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Li, C., and Wand, M. (2016, January 11\u201314). Precomputed real-time texture synthesis with markovian generative adversarial networks. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46487-9_43"},{"key":"ref_47","unstructured":"Larsen, A.B.L., S\u00f8nderby, S.K., Larochelle, H., and Winther, O. (2016, January 20\u201322). Autoencoding beyond pixels using a learned similarity metric. Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Karras, T., Laine, S., and Aila, T. (2019, January 15\u201320). A style-based generator architecture for generative adversarial networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00453"},{"key":"ref_49","unstructured":"Mirza, M., and Osindero, S. (2014). Conditional generative adversarial nets. arXiv."},{"key":"ref_50","unstructured":"Arjovsky, M., Chintala, S., and Bottou, L. (2017, January 6\u201311). Wasserstein Generative Adversarial Networks. Proceedings of the 34th International Conference on Machine Learning, Sydney, NSW, Australia."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1137\/1115049","article-title":"Prescribing a system of random variables by conditional distributions","volume":"15","author":"Dobrushin","year":"1970","journal-title":"Theory Probab. Appl."},{"key":"ref_52","unstructured":"Liu, M.Y., and Tuzel, O. (2016, January 5\u201310). Coupled generative adversarial networks. Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_53","unstructured":"LeCun, Y., Cortes, C., and Burges, C. (2010). MNIST handwritten digit database. 
ATT Labs, 2, Available online: http:\/\/yann.lecun.com\/exdb\/mnist."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Cohen, G., Afshar, S., Tapson, J., and Schaik, A.V. (2017, January 14\u201319). EMNIST: Extending MNIST to handwritten letters. Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA.","DOI":"10.1109\/IJCNN.2017.7966217"},{"key":"ref_55","unstructured":"Nair, V., and Hinton, G.E. (2010, January 21\u201324). Rectified linear units improve restricted boltzmann machines. Proceedings of the 27th International Conference on Machine Learning, ICML, Haifa, Israel."},{"key":"ref_56","unstructured":"Xu, B., Wang, N., Chen, T., and Li, M. (2015). Empirical evaluation of rectified activations in convolutional network. arXiv."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1162\/neco.1989.1.4.541","article-title":"Backpropagation applied to handwritten zip code recognition","volume":"1","author":"LeCun","year":"1989","journal-title":"Neural Comput."},{"key":"ref_58","first-page":"1929","article-title":"Dropout: A simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_59","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning, Lille, France."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. 
Image Process."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/2\/3\/17\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:06:18Z","timestamp":1760177178000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/2\/3\/17"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,25]]},"references-count":60,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["make2030017"],"URL":"https:\/\/doi.org\/10.3390\/make2030017","relation":{},"ISSN":["2504-4990"],"issn-type":[{"type":"electronic","value":"2504-4990"}],"subject":[],"published":{"date-parts":[[2020,8,25]]}}}