{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T12:30:05Z","timestamp":1768825805193,"version":"3.49.0"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2021,10,31]],"date-time":"2021-10-31T00:00:00Z","timestamp":1635638400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2019YFB2102100"],"award-info":[{"award-number":["2019YFB2102100"]}]},{"name":"Science and Technology Development Fund of Macau SAR","award":["0015\/201-9\/AKP"],"award-info":[{"award-number":["0015\/201-9\/AKP"]}]},{"DOI":"10.13039\/501100021171","name":"GuangDong Basic and Applied Basic Research Foundation","doi-asserted-by":"crossref","award":["2020B515130004"],"award-info":[{"award-number":["2020B515130004"]}],"id":[{"id":"10.13039\/501100021171","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Key-area Research and Development Program of Guangdong Province","award":["2020B010164003"],"award-info":[{"award-number":["2020B010164003"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2021,10,31]]},"abstract":"<jats:p>\n            Ensemble learning is a widely used technique to train deep convolutional neural networks (CNNs) for improved robustness and accuracy. 
While existing algorithms usually first train multiple diversified networks and then assemble these networks as an aggregated classifier, we propose a novel learning paradigm, namely,\n            <jats:italic>\u201cIn-Network Ensemble\u201d<\/jats:italic>\n            (\n            <jats:bold>INE<\/jats:bold>\n            ) that incorporates the diversity of multiple models through training a SINGLE deep neural network. Specifically,\n            <jats:bold>INE<\/jats:bold>\n            segments the outputs of the CNN into multiple independent classifiers, where each classifier is further fine-tuned for better accuracy through a so-called\n            <jats:italic>diversified knowledge distillation process<\/jats:italic>\n            . We then aggregate the fine-tuned independent classifiers using an Averaging-and-Softmax operator to obtain the final ensemble classifier. Note that, in the supervised learning settings,\n            <jats:bold>INE<\/jats:bold>\n            starts the CNN training from random initialization, while, under the transfer learning settings, it could also start with a pre-trained model to incorporate the knowledge learned from additional datasets. Extensive experiments were conducted using eight large-scale real-world datasets, including CIFAR, ImageNet, and Stanford Cars, among others, as well as common deep network architectures such as VGG, ResNet, and Wide ResNet. We have evaluated the method under two tasks: supervised learning and transfer learning. 
The results show that\n            <jats:bold>INE<\/jats:bold>\n            outperforms the state-of-the-art algorithms for deep ensemble learning with improved accuracy.\n          <\/jats:p>","DOI":"10.1145\/3473464","type":"journal-article","created":{"date-parts":[[2021,12,21]],"date-time":"2021-12-21T16:42:08Z","timestamp":1640104928000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["\u201cIn-Network Ensemble\u201d: Deep Ensemble Learning with Diversified Knowledge Distillation"],"prefix":"10.1145","volume":"12","author":[{"given":"Xingjian","family":"Li","sequence":"first","affiliation":[{"name":"Baidu Inc., Beijing, China and University of Macau, Taipa, Macau, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoyi","family":"Xiong","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zeyu","family":"Chen","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, Guandong, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jun","family":"Huan","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cheng-Zhong","family":"Xu","sequence":"additional","affiliation":[{"name":"University of Macau, Taipa, Macau, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dejing","family":"Dou","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,12,21]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"International Conference on Machine Learning. PMLR, 242\u2013252","author":"Allen-Zhu Zeyuan","year":"2019","unstructured":"Zeyuan Allen-Zhu , Yuanzhi Li , and Zhao Song . 2019 . 
A convergence theory for deep learning via over-parameterization . In International Conference on Machine Learning. PMLR, 242\u2013252 . Zeyuan Allen-Zhu, Yuanzhi Li, and Zhao Song. 2019. A convergence theory for deep learning via over-parameterization. In International Conference on Machine Learning. PMLR, 242\u2013252."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999792.2999926"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1390681.1442799"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.461"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.5555\/3305381.3305472"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/648054.743935"},{"key":"e_1_2_1_7_1","volume-title":"The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635","author":"Frankle Jonathan","year":"2018","unstructured":"Jonathan Frankle and Michael Carbin . 2018. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635 ( 2018 ). Jonathan Frankle and Michael Carbin. 2018. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635 (2018)."},{"key":"e_1_2_1_8_1","volume-title":"International Conference on Learning Representations.","author":"French Geoff","year":"2018","unstructured":"Geoff French , Michal Mackiewicz , and Mark Fisher . 2018 . Self-ensembling for visual domain adaptation . In International Conference on Learning Representations. Geoff French, Michal Mackiewicz, and Mark Fisher. 2018. Self-ensembling for visual domain adaptation. 
In International Conference on Learning Representations."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1214\/009053604000000058"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/3327546.3327556"},{"key":"e_1_2_1_11_1","volume-title":"13th International Conference on Artificial Intelligence and Statistics. 249\u2013256","author":"Glorot Xavier","year":"2010","unstructured":"Xavier Glorot and Yoshua Bengio . 2010 . Understanding the difficulty of training deep feedforward neural networks . In 13th International Conference on Artificial Intelligence and Statistics. 249\u2013256 . Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In 13th International Conference on Artificial Intelligence and Statistics. 249\u2013256."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/3086952"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/3042817.3043084"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.63"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.58871"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-44781-0_9"},{"key":"e_1_2_1_17_1","volume-title":"Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531","author":"Hinton Geoffrey","year":"2015","unstructured":"Geoffrey Hinton , Oriol Vinyals , and Jeff Dean . 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 ( 2015 ). Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)."},{"key":"e_1_2_1_18_1","volume-title":"Salakhutdinov","author":"Hinton Geoffrey E.","year":"2012","unstructured":"Geoffrey E. Hinton , Nitish Srivastava , Alex Krizhevsky , Ilya Sutskever , and Ruslan R . Salakhutdinov . 2012 . 
Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012). Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R. Salakhutdinov. 2012. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012)."},{"key":"e_1_2_1_19_1","volume-title":"International Conference on Learning Representations.","author":"Huang Gao","unstructured":"Gao Huang , Yixuan Li , Geoff Pleiss , Zhuang Liu , John E. Hopcroft , and Kilian Q. Weinberger . 2017. Snapshot ensembles: Train 1, get M for free . In International Conference on Learning Representations. Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft, and Kilian Q. Weinberger. 2017. Snapshot ensembles: Train 1, get M for free. In International Conference on Learning Representations."},{"key":"e_1_2_1_20_1","volume-title":"Weinberger","author":"Huang Gao","year":"2016","unstructured":"Gao Huang , Yu Sun , Zhuang Liu , Daniel Sedra , and Kilian Q . Weinberger . 2016 . Deep networks with stochastic depth. In European Conference on Computer Vision. Springer , 646\u2013661. Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, and Kilian Q. Weinberger. 2016. Deep networks with stochastic depth. In European Conference on Computer Vision. Springer, 646\u2013661."},{"key":"e_1_2_1_21_1","volume-title":"Conference on Uncertainty in Artificial Intelligence (UAI).","author":"Izmailov Pavel","year":"2018","unstructured":"Pavel Izmailov , Dmitrii Podoprikhin , Timur Garipov , Dmitry Vetrov , and Andrew Gordon Wilson . 2018 . Averaging weights leads to wider optima and better generalization . In Conference on Uncertainty in Artificial Intelligence (UAI). Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson. 2018. Averaging weights leads to wider optima and better generalization. 
In Conference on Uncertainty in Artificial Intelligence (UAI)."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2701831"},{"key":"e_1_2_1_23_1","volume-title":"Bridgeout: Stochastic bridge regularization for deep neural networks. arXiv preprint arXiv:1804.08042","author":"Khan Najeeb","year":"2018","unstructured":"Najeeb Khan , Jawad Shah , and Ian Stavness . 2018 . Bridgeout: Stochastic bridge regularization for deep neural networks. arXiv preprint arXiv:1804.08042 (2018). Najeeb Khan, Jawad Shah, and Ian Stavness. 2018. Bridgeout: Stochastic bridge regularization for deep neural networks. arXiv preprint arXiv:1804.08042 (2018)."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2013.77"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/2998687.2998716"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022859003006"},{"key":"e_1_2_1_27_1","volume-title":"International Conference on Learning Representations.","author":"Laine Samuli","year":"2017","unstructured":"Samuli Laine and Timo Aila . 2017 . Temporal ensembling for semi-supervised learning . In International Conference on Learning Representations. Samuli Laine and Timo Aila. 2017. Temporal ensembling for semi-supervised learning. In International Conference on Learning Representations."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/3104482.3104516"},{"key":"e_1_2_1_29_1","volume-title":"Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710","author":"Li Hao","year":"2016","unstructured":"Hao Li , Asim Kadav , Igor Durdanovic , Hanan Samet , and Hans Peter Graf . 2016. Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710 ( 2016 ). Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. 2016. Pruning filters for efficient convnets. 
arXiv preprint arXiv:1608.08710 (2016)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33460-3_27"},{"key":"e_1_2_1_31_1","volume-title":"35th International Conference on Machine Learning.","author":"Li Xuhong","year":"2018","unstructured":"Xuhong Li , Yves Grandvalet , and Franck Davoine . 2018 . Explicit inductive bias for transfer learning with convolutional networks . In 35th International Conference on Machine Learning. Xuhong Li, Yves Grandvalet, and Franck Davoine. 2018. Explicit inductive bias for transfer learning with convolutional networks. In 35th International Conference on Machine Learning."},{"key":"e_1_2_1_32_1","volume-title":"International Conference on Machine Learning. PMLR, 6010\u20136019","author":"Li Xingjian","year":"2020","unstructured":"Xingjian Li , Haoyi Xiong , Haozhe An , Cheng-Zhong Xu , and Dejing Dou . 2020 . RIFLE: Backpropagation in depth for deep transfer learning through Re-Initializing the Fully-connected LayEr . In International Conference on Machine Learning. PMLR, 6010\u20136019 . Xingjian Li, Haoyi Xiong, Haozhe An, Cheng-Zhong Xu, and Dejing Dou. 2020. RIFLE: Backpropagation in depth for deep transfer learning through Re-Initializing the Fully-connected LayEr. In International Conference on Machine Learning. PMLR, 6010\u20136019."},{"key":"e_1_2_1_33_1","volume-title":"International Conference on Learning Representations.","author":"Li Xingjian","year":"2019","unstructured":"Xingjian Li , Haoyi Xiong , Hanchao Wang , Yuxuan Rao , Liping Liu , and Jun Huan . 2019 . Delta: Deep learning transfer using feature map with attention for convolutional networks . In International Conference on Learning Representations. Xingjian Li, Haoyi Xiong, Hanchao Wang, Yuxuan Rao, Liping Liu, and Jun Huan. 2019. Delta: Deep learning transfer using feature map with attention for convolutional networks. 
In International Conference on Learning Representations."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/3327757.3327910"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2773081"},{"key":"e_1_2_1_36_1","volume-title":"International Conference on Learning Representations.","author":"Loshchilov Ilya","year":"2016","unstructured":"Ilya Loshchilov and Frank Hutter . 2016 . SGDR: Stochastic gradient descent with warm restarts . In International Conference on Learning Representations. Ilya Loshchilov and Frank Hutter. 2016. SGDR: Stochastic gradient descent with warm restarts. In International Conference on Learning Representations."},{"key":"e_1_2_1_37_1","unstructured":"S. Maji J. Kannala E. Rahtu M. Blaschko and A. Vedaldi. 2013. Fine-grained Visual Classification of Aircraft. arXiv preprint arXiv:1306.5151.  S. Maji J. Kannala E. Rahtu M. Blaschko and A. Vedaldi. 2013. Fine-grained Visual Classification of Aircraft. arXiv preprint arXiv:1306.5151."},{"key":"e_1_2_1_38_1","volume-title":"Magoulas","author":"Mosca Alan","year":"2017","unstructured":"Alan Mosca and George D . Magoulas . 2017 . Deep incremental boosting. arXiv preprint arXiv:1708.03704 (2017). Alan Mosca and George D. Magoulas. 2017. Deep incremental boosting. arXiv preprint arXiv:1708.03704 (2017)."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.283"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICVGIP.2008.47"},{"key":"e_1_2_1_41_1","volume-title":"Cooper","author":"Perrone Michael P.","year":"1992","unstructured":"Michael P. Perrone and Leon N . Cooper . 1992 . When Networks Disagree: Ensemble Methods for Hybrid Neural Networks. Technical Report. Institute for Brain and Neural Systems, Brown University , Providence, RI. Michael P. Perrone and Leon N. Cooper. 1992. When Networks Disagree: Ensemble Methods for Hybrid Neural Networks. Technical Report. 
Institute for Brain and Neural Systems, Brown University, Providence, RI."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.5555\/645530.655657"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/3157096.3157100"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.5555\/3294771.3294885"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.5555\/3042817.3043055"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2019.00068"},{"key":"e_1_2_1_48_1","volume-title":"Horizontal and vertical ensemble with deep representation for classification. arXiv preprint arXiv:1306.2759","author":"Xie Jingjing","year":"2013","unstructured":"Jingjing Xie , Bing Xu , and Zhang Chuang . 2013. Horizontal and vertical ensemble with deep representation for classification. arXiv preprint arXiv:1306.2759 ( 2013 ). Jingjing Xie, Bing Xu, and Zhang Chuang. 2013. Horizontal and vertical ensemble with deep representation for classification. arXiv preprint arXiv:1306.2759 (2013)."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.754"},{"key":"e_1_2_1_50_1","volume-title":"Zeiler and Rob Fergus","author":"Matthew","year":"2014","unstructured":"Matthew D. Zeiler and Rob Fergus . 2014 . Visualizing and understanding convolutional networks. In European Conference on Computer Vision. Springer , 818\u2013833. Matthew D. Zeiler and Rob Fergus. 2014. Visualizing and understanding convolutional networks. In European Conference on Computer Vision. 
Springer, 818\u2013833."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2020.3002614"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3473464","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3473464","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:28:15Z","timestamp":1750195695000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3473464"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,31]]},"references-count":51,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2021,10,31]]}},"alternative-id":["10.1145\/3473464"],"URL":"https:\/\/doi.org\/10.1145\/3473464","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,31]]},"assertion":[{"value":"2021-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-12-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}