{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,19]],"date-time":"2026-02-19T14:49:14Z","timestamp":1771512554937,"version":"3.50.1"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T00:00:00Z","timestamp":1614902400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61906060 and 91746209"],"award-info":[{"award-number":["61906060 and 91746209"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Key Research and Development Program of China","award":["2016YFB1000901"],"award-info":[{"award-number":["2016YFB1000901"]}]},{"name":"Ministry of Education","award":["IRT17R32"],"award-info":[{"award-number":["IRT17R32"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2021,4,30]]},"abstract":"<jats:p>Deep learning seeks to achieve excellent performance for representation learning in image datasets. However, supervised deep learning models such as convolutional neural networks require a large number of labeled image data, which is intractable in applications, while unsupervised deep learning models like stacked denoising auto-encoder cannot employ label information. Meanwhile, the redundancy of image data incurs performance degradation on representation learning for aforementioned models. To address these problems, we propose a semi-supervised deep learning framework called stacked convolutional sparse auto-encoder, which can learn robust and sparse representations from image data with fewer labeled data records. More specifically, the framework is constructed by stacking layers. In each layer, higher layer feature representations are generated by features of lower layers in a convolutional way with kernels learned by a sparse auto-encoder. Meanwhile, to solve the data redundance problem, the algorithm of Reconstruction Independent Component Analysis is designed to train on patches for sphering the input data. The label information is encoded using a Softmax Regression model for semi-supervised learning. With this framework, higher level representations are learned by layers mapping from image data. It can boost the performance of the base subsequent classifiers such as support vector machines. Extensive experiments demonstrate the superior classification performance of our framework compared to several state-of-the-art representation learning methods.<\/jats:p>","DOI":"10.1145\/3434767","type":"journal-article","created":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T11:11:56Z","timestamp":1614942716000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Stacked Convolutional Sparse Auto-Encoders for Representation Learning"],"prefix":"10.1145","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3045-2588","authenticated-orcid":false,"given":"Yi","family":"Zhu","sequence":"first","affiliation":[{"name":"Yangzhou University, Hefei University of Technology, China"}]},{"given":"Lei","family":"Li","sequence":"additional","affiliation":[{"name":"Hefei University of Technology, China"}]},{"given":"Xindong","family":"Wu","sequence":"additional","affiliation":[{"name":"Mininglamp Academy of Sciences, Minininglamp and Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei, China"}]}],"member":"320","published-online":{"date-parts":[[2021,3,5]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/648029.745336"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1146\/annurev.ne.08.030185.002203"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1561\/2200000006"},{"key":"e_1_2_1_4_1","unstructured":"Yoshua Bengio Yann LeCun and Donnie Henderson. 1993. Globally trained handwritten word recognizer using spatial representation convolutional neural networks and hidden Markov models. In Advances in Neural Information Processing Systems. 937--944.  Yoshua Bengio Yann LeCun and Donnie Henderson. 1993. Globally trained handwritten word recognizer using spatial representation convolutional neural networks and hidden Markov models. In Advances in Neural Information Processing Systems. 937--944."},{"key":"e_1_2_1_5_1","volume-title":"International Conference on Machine Learning. 111--118","author":"Boureau Lan","year":"2010"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46493-0_22"},{"key":"e_1_2_1_7_1","volume-title":"International Conference on Machine Learning. 767--774","author":"Chen Minmin","year":"2012"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/293"},{"key":"e_1_2_1_9_1","volume-title":"International Conference on Artificial Intelligence and Statistics. 215--223","author":"Coates Adam","year":"2011"},{"key":"e_1_2_1_10_1","volume-title":"International Conference on Machine Learning. 647--655","author":"Donahue Jeff","year":"2014"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2016.2536638"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330673"},{"key":"e_1_2_1_13_1","volume-title":"Fast R-CNN. In IEEE Conference on Computer Vision and Pattern Recognition. 1440--1448","author":"Girshick Ross","year":"2015"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46672-9_10"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2018.03.032"},{"key":"e_1_2_1_17_1","volume-title":"Mask R-CNN. In IEEE Conference on Computer Vision and Pattern Recognition. 2980--2988","author":"He Kaiming","year":"2017"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00065"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299149"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.2006.18.7.1527"},{"key":"e_1_2_1_22_1","volume-title":"LSDA: Large scale detection through adaptation. In Advances in Neural Information Processing Systems. 3536--3544.","author":"Hoffman Judy","year":"2014"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5833"},{"key":"e_1_2_1_24_1","volume-title":"Hinton","author":"Krizhevsky Alex","year":"2012"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.554195"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2006.68"},{"key":"e_1_2_1_27_1","unstructured":"Quoc V. Le Alexandre Karpenko Jiquan Ngiam and Andrew Ng. 2011. ICA with reconstruction cost for efficient overcomplete feature learning. In Advances in Neural Information Processing Systems. 1017--1025.  Quoc V. Le Alexandre Karpenko Jiquan Ngiam and Andrew Ng. 2011. ICA with reconstruction cost for efficient overcomplete feature learning. In Advances in Neural Information Processing Systems. 1017--1025."},{"key":"e_1_2_1_28_1","first-page":"1995","article-title":"Convolutional networks for images, speech, and time series","volume":"3361","author":"LeCun Yann","year":"1995","journal-title":"The Handbook of Brain Theory and Neural Networks"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_1_30_1","volume-title":"Workshop on Challenges in Representation Learning, International Conference on Machine Learning. 2--8.","author":"Lee Dong-Hyun","year":"2013"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553453"},{"key":"e_1_2_1_32_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition Workshop. 178--178","author":"Li Feifei","year":"2004"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1002\/aic.12419"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1488"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3269206.3272010"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-21735-7_7"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206577"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.222"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.measurement.2016.04.007"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15567-3_11"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.5555\/1756006.1953039"},{"key":"e_1_2_1_43_1","volume-title":"The World Wide Web Conference. 2022--2032","author":"Wang Xiao"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-66179-7_54"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00985"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2015.2510498"},{"key":"e_1_2_1_47_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition. 1794--1801","author":"Yang Jianchao","year":"2009"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.3115\/981658.981684"},{"key":"e_1_2_1_49_1","volume-title":"International Conference on Image Processing. 338--341","author":"Minerva"},{"key":"e_1_2_1_50_1","unstructured":"Rex Ying Jiaxuan You Christopher Morris Xiang Ren Will Hamilton and Jure Leskovec. 2018. Hierarchical graph representation learning with differentiable pooling. In Advances in Neural Information Processing Systems. 4800--4810.  Rex Ying Jiaxuan You Christopher Morris Xiang Ren Will Hamilton and Jure Leskovec. 2018. Hierarchical graph representation learning with differentiable pooling. In Advances in Neural Information Processing Systems. 4800--4810."},{"key":"e_1_2_1_51_1","volume-title":"IEEE Conference on Computer Vision. 4471--4480","author":"Yu Jiahui"},{"key":"e_1_2_1_52_1","volume-title":"European Conference on Computer Vision. 818--833","author":"Matthew"},{"key":"e_1_2_1_53_1","volume-title":"25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 793--803","author":"Zhang Chuxu"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2014.2337883"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2018.04.010"},{"key":"e_1_2_1_56_1","volume-title":"IEEE International Conference on Data Mining. 1141--1146","author":"Zhuang Fuzhen","year":"2016"},{"key":"e_1_2_1_57_1","unstructured":"Will Zou Shenghuo Zhu Kai Yu and Andrew Ng. 2012. Deep learning of invariant features via simulated fixations in video. In Advances in Neural Information Processing Systems. 3203--3211.  Will Zou Shenghuo Zhu Kai Yu and Andrew Ng. 2012. Deep learning of invariant features via simulated fixations in video. In Advances in Neural Information Processing Systems. 3203--3211."}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3434767","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3434767","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:31:58Z","timestamp":1750195918000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3434767"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,5]]},"references-count":56,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,4,30]]}},"alternative-id":["10.1145\/3434767"],"URL":"https:\/\/doi.org\/10.1145\/3434767","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,5]]},"assertion":[{"value":"2019-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-03-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}