{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,1,18]],"date-time":"2024-01-18T00:30:54Z","timestamp":1705537854960},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2014,9,30]],"date-time":"2014-09-30T00:00:00Z","timestamp":1412035200000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2015,5]]},"DOI":"10.1007\/s11263-014-0765-x","type":"journal-article","created":{"date-parts":[[2014,9,29]],"date-time":"2014-09-29T16:01:55Z","timestamp":1412006515000},"page":"67-79","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["A Neural Autoregressive Approach to Attention-based Recognition"],"prefix":"10.1007","volume":"113","author":[{"given":"Yin","family":"Zheng","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Richard S.","family":"Zemel","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yu-Jin","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hugo","family":"Larochelle","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2014,9,30]]},"reference":[{"key":"765_CR1","unstructured":"Bazzani, L., Freitas, N., Larochelle, H., Murino, V., & Ting, J.-A. (2011). Learning attentional policies for tracking and recognition in video with deep networks. In Proceedings of the 28th international conference on machine learning (ICML 2011) (pp. 937\u2013944). ACM."},{"issue":"2","key":"765_CR2","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1109\/TAMD.2010.2051029","volume":"2","author":"NJ Butko","year":"2010","unstructured":"Butko, N. J., & Movellan, J. R. (2010). Infomax control of eye movements. IEEE Transactions on Autonomous Mental Development, 2(2), 91\u2013107.","journal-title":"IEEE Transactions on Autonomous Mental Development"},{"key":"765_CR3","doi-asserted-by":"crossref","unstructured":"Cheng, M.-M., Zhang, G.-X., Mitra, N. J., Huang, X., & Hu, S.-M. (2011). Global contrast based salient region detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011 (pp. 409\u2013416). IEEE.","DOI":"10.1109\/CVPR.2011.5995344"},{"key":"765_CR4","doi-asserted-by":"crossref","unstructured":"Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE computer society conference on computer vision and pattern recognition. CVPR 2005 (Vol. 1, pp. 886\u2013893). IEEE.","DOI":"10.1109\/CVPR.2005.177"},{"issue":"2","key":"765_CR5","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","volume":"60","author":"G David","year":"2004","unstructured":"David, G. (2004). Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91\u2013110.","journal-title":"International Journal of Computer Vision"},{"issue":"8","key":"765_CR6","doi-asserted-by":"crossref","first-page":"2151","DOI":"10.1162\/NECO_a_00312","volume":"24","author":"M Denil","year":"2012","unstructured":"Denil, M., Bazzani, L., Larochelle, H., & de Freitas, N. (2012). Learning where to attend with deep architectures for image tracking. Neural Computation, 24(8), 2151\u20132184.","journal-title":"Neural Computation"},{"key":"765_CR7","doi-asserted-by":"crossref","unstructured":"Erez, T., Tramper, J. J., Smart, W. D., & Stan CAM Gielen. (2011). A pomdp model of eye-hand coordination. In AAAI.","DOI":"10.1609\/aaai.v25i1.8007"},{"issue":"1","key":"765_CR8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.cogpsych.2008.05.001","volume":"58","author":"A Fazl","year":"2009","unstructured":"Fazl, A., Grossberg, S., & Mingolla, E. (2009). View-invariant object category learning, recognition, and search: How spatial and object attention are coordinated using surface-based attentional shrouds. Cognitive psychology, 58(1), 1\u201348.","journal-title":"Cognitive psychology"},{"key":"765_CR9","unstructured":"Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 ."},{"issue":"8","key":"765_CR10","doi-asserted-by":"crossref","first-page":"1771","DOI":"10.1162\/089976602760128018","volume":"14","author":"GE Hinton","year":"2002","unstructured":"Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8), 1771\u20131800.","journal-title":"Neural Computation"},{"key":"765_CR11","doi-asserted-by":"crossref","unstructured":"Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254\u20131259.","DOI":"10.1109\/34.730558"},{"key":"765_CR12","doi-asserted-by":"crossref","unstructured":"Judd, T., Ehinger, K., Durand, F., & Torralba, A. (2009). Learning to predict where humans look. In IEEE International Conference on Computer Vision (ICCV).","DOI":"10.1109\/ICCV.2009.5459462"},{"key":"765_CR13","doi-asserted-by":"crossref","unstructured":"Kanan, C., & Cottrell, G. (2010) Robust classification of objects, faces, and flowers using natural image statistics. In CVPR.","DOI":"10.1109\/CVPR.2010.5539947"},{"key":"765_CR14","unstructured":"Krause, A., & Ong, C. S. (2011). Contextual gaussian process bandit optimization. In NIPS (pp. 2447\u20132455)."},{"key":"765_CR15","first-page":"1106","volume":"25","author":"A Krizhevsky","year":"2012","unstructured":"Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 1106\u20131114.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"765_CR16","doi-asserted-by":"crossref","unstructured":"Larochelle, H., & Bengio, Y. (2008). Classification using discriminative restricted boltzmann machines. In Proceedings of the 25th international conference on machine learning (pp. 536\u2013543). ACM.","DOI":"10.1145\/1390156.1390224"},{"key":"765_CR17","unstructured":"Larochelle, H., & Hinton, G. E. (2010). Learning to combine foveal glimpses with a third-order Boltzmann machine. In Advances in neural information processing systems (pp. 1243\u20131251)."},{"key":"765_CR18","first-page":"29","volume":"15","author":"H Larochelle","year":"2011","unstructured":"Larochelle, H., & Murray, I. (2011). The neural autoregressive distribution estimator. Artificial Intelligence and Statistics (AISTATS), 15, 29\u201337.","journal-title":"Artificial Intelligence and Statistics (AISTATS)"},{"key":"765_CR19","first-page":"2717","volume":"25","author":"H Larochelle","year":"2012","unstructured":"Larochelle, H., & Lauly, S. (2012). A neural autoregressive topic model. Advances in Neural Information Processing Systems, 25, 2717\u20132725.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"765_CR20","unstructured":"Lazebnik, S. (2006). Cordelia, and Jean Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR."},{"key":"765_CR21","unstructured":"Mathe, S., & Sminchisescu, C. (2013). Action from still image dataset and inverse optimal control to learn task specific visual scanpaths. In Advances in neural information processing systems (pp. 1923\u20131931, 2013)."},{"key":"765_CR22","unstructured":"Nair, V., & Hinton, G. E. (2010) Rectified linear units improve restricted boltzmann machines. In ICML."},{"issue":"7031","key":"765_CR23","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1038\/nature03390","volume":"434","author":"J Najemnik","year":"2005","unstructured":"Najemnik, J., & Geisler, W. S. (2005). Optimal eye movement strategies in visual search. Nature, 434(7031), 387\u2013391.","journal-title":"Nature"},{"key":"765_CR24","doi-asserted-by":"crossref","unstructured":"Perazzi, F., Krahenbuhl, P., Pritch, Y., & Hornung, A. (2012). Saliency filters: Contrast based filtering for salient region detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012 (pp. 733\u2013740). IEEE.","DOI":"10.1109\/CVPR.2012.6247743"},{"key":"765_CR25","unstructured":"Rifai, S., Vincent, P., Muller, X., Glorot, X., & Bengio, Y. (2011). Contractive auto-encoders: Explicit invariance during feature extraction. In Proceedings of the 28th international conference on machine learning (ICML 2011)."},{"key":"765_CR26","doi-asserted-by":"crossref","unstructured":"Schmidhuber, J., & Huber, R. (1991). Learning to generate artificial fovea trajectories for target detection. International Journal of Neural Systems, 2(01n02), 125\u2013134.","DOI":"10.1142\/S012906579100011X"},{"key":"765_CR27","unstructured":"Southall, J. P. C. (1962). Helmholtzs treatise on physiological optics. vol. 2: The sensation of vision, trans. J. P. C. Southall. (translated from the third german edition)."},{"key":"765_CR28","unstructured":"Susskind, J. M., Anderson, A. K., & Hinton, G. E. (2010). The toronto face database. Department of Computer Science, University of Toronto, Toronto, ON, Canada, Tech. Rep."},{"key":"765_CR29","first-page":"2175","volume":"26","author":"B Uria","year":"2013","unstructured":"Uria, B., Murray, I., & Larochelle, H. (2013). Rnade: The real-valued neural autoregressive density-estimator. Advances in Neural Information Processing Systems, 26, 2175\u20132183.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"765_CR30","doi-asserted-by":"crossref","unstructured":"Vincent, P., Larochelle, H., Bengio, Y., & Manzagol, P.-A. (2008). Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on machine learning (ICML 2008) (pp. 1096\u20131103). ACM.","DOI":"10.1145\/1390156.1390294"},{"key":"765_CR31","unstructured":"Yang, J., Yu., K., & Gong, Y. (2009). Linear spatial pyramid matching using sparse coding for image classification. In CVPR."}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-014-0765-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11263-014-0765-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-014-0765-x","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,16]],"date-time":"2023-07-16T21:49:27Z","timestamp":1689544167000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11263-014-0765-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,9,30]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2015,5]]}},"alternative-id":["765"],"URL":"https:\/\/doi.org\/10.1007\/s11263-014-0765-x","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"print"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,9,30]]}}}