{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:27:28Z","timestamp":1740122848036,"version":"3.37.3"},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2024,6,6]],"date-time":"2024-06-06T00:00:00Z","timestamp":1717632000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,6]],"date-time":"2024-06-06T00:00:00Z","timestamp":1717632000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61773330","61773330"],"award-info":[{"award-number":["61773330","61773330"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Key Research and Development Project of China","award":["2020YFA0713503"],"award-info":[{"award-number":["2020YFA0713503"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Process Lett"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Deep neural networks perform better than shallow neural networks, but the former tends to be deeper or wider, introducing large numbers of parameters and computations. We know that networks that are too wide have a high risk of overfitting and networks that are too deep require a large amount of computation. This paper proposed a narrow-deep ResNet, increasing the depth of the network while avoiding other issues caused by making the network too wide, and used the strategy of knowledge distillation, where we set up a trained teacher model to train an unmodified, wide, and narrow-deep ResNet that allows students to learn the teacher\u2019s output. 
To validate the effectiveness of this method, it is tested on Cifar-100 and Pascal VOC datasets. The method proposed in this paper allows a small model to have about the same accuracy rate as a large model, while dramatically shrinking the response time and computational effort.<\/jats:p>","DOI":"10.1007\/s11063-024-11646-5","type":"journal-article","created":{"date-parts":[[2024,6,6]],"date-time":"2024-06-06T03:49:42Z","timestamp":1717645782000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Knowledge Distillation Based on Narrow-Deep Networks"],"prefix":"10.1007","volume":"56","author":[{"given":"Yan","family":"Zhou","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhiqiang","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianxun","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,6,6]]},"reference":[{"key":"11646_CR1","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"11646_CR2","doi-asserted-by":"crossref","unstructured":"Zagoruyko S, Komodakis N (2016) Wide residual networks. arXiv preprint arXiv:1605.07146","DOI":"10.5244\/C.30.87"},{"key":"11646_CR3","unstructured":"Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. 
arXiv preprint arXiv:1704.04861"},{"key":"11646_CR4","unstructured":"Iandola FN, Han S, Moskewicz MW, Ashraf K, Dally WJ, Keutzer K (2016) Squeezenet: alexnet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360"},{"key":"11646_CR5","unstructured":"Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861"},{"key":"11646_CR6","unstructured":"Han S, Mao H, Dally WJ (2015) Deep compression: compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149"},{"key":"11646_CR7","unstructured":"Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531"},{"key":"11646_CR8","unstructured":"Romero A, Ballas N, Kahou SE, Chassang A, Gatta C, Bengio Y (2014) Fitnets: hints for thin deep nets. arXiv preprint arXiv:1412.6550"},{"key":"11646_CR9","doi-asserted-by":"crossref","unstructured":"Zhao B, Cui Q, Song R, Qiu Y, Liang J (2022) Decoupled knowledge distillation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition. pp 11953\u201311962","DOI":"10.1109\/CVPR52688.2022.01165"},{"key":"11646_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s11063-023-11328-8","volume":"55","author":"M Shao","year":"2023","unstructured":"Shao M, Li S, Peng Z, Sun Y (2023) Adversarial-based ensemble feature knowledge distillation. Neural Process Lett 55:1\u201315","journal-title":"Neural Process Lett"},{"key":"11646_CR11","doi-asserted-by":"crossref","unstructured":"Cho JH, Hariharan B (2019) On the efficacy of knowledge distillation. In: Proceedings of the IEEE\/CVF international conference on computer vision.
pp 4794\u20134802","DOI":"10.1109\/ICCV.2019.00489"},{"key":"11646_CR12","first-page":"12084","volume":"35","author":"Y Chen","year":"2022","unstructured":"Chen Y, Wang S, Liu J, Xu X, Hoog F, Huang Z (2022) Improved feature distillation via projector ensemble. Adv Neural Inf Process Syst 35:12084\u201312095","journal-title":"Adv Neural Inf Process Syst"},{"key":"11646_CR13","doi-asserted-by":"crossref","unstructured":"Yang Z, Li Z, Shao M, Shi D, Yuan Z, Yuan C (2022) Masked generative distillation. In: European conference on computer vision. Springer, pp 53\u201369","DOI":"10.1007\/978-3-031-20083-0_4"},{"key":"11646_CR14","first-page":"1504","volume":"37","author":"Z Li","year":"2023","unstructured":"Li Z, Li X, Yang L, Zhao B, Song R, Luo L, Li J, Yang J (2023) Curriculum temperature for knowledge distillation. Proc AAAI Conf Artif Intell 37:1504\u20131512","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"11646_CR15","doi-asserted-by":"crossref","unstructured":"Chen X, Cao Q, Zhong Y, Zhang J, Gao S, Tao D (2022) Dearkd: data-efficient early knowledge distillation for vision transformers. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition. pp 12052\u201312062","DOI":"10.1109\/CVPR52688.2022.01174"},{"key":"11646_CR16","doi-asserted-by":"crossref","unstructured":"Bai Y, Wang Z, Xiao J, Wei C, Wang H, Yuille AL, Zhou Y, Xie C (2023) Masked autoencoders enable efficient knowledge distillers. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition. pp 24256\u201324265","DOI":"10.1109\/CVPR52729.2023.02323"},{"key":"11646_CR17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s11063-023-11158-8","volume":"55","author":"L Li","year":"2023","unstructured":"Li L, Su W, Liu F, He M, Liang X (2023) Knowledge fusion distillation: improving distillation with multi-scale attention mechanisms.
Neural Process Lett 55:1\u201316","journal-title":"Neural Process Lett"},{"issue":"3","key":"11646_CR18","doi-asserted-by":"publisher","first-page":"2613","DOI":"10.1007\/s11063-022-11038-7","volume":"55","author":"N Jiang","year":"2023","unstructured":"Jiang N, Tang J, Yu W (2023) Positive-unlabeled learning for knowledge distillation. Neural Process Lett 55(3):2613\u20132631","journal-title":"Neural Process Lett"},{"key":"11646_CR19","unstructured":"Yang Z, Li Z, Zeng A, Li Z, Yuan C, Li Y (2022) Vitkd: practical guidelines for vit feature knowledge distillation. arXiv preprint arXiv:2209.02432"},{"key":"11646_CR20","doi-asserted-by":"crossref","unstructured":"Yang C, Zhou H, An Z, Jiang X, Xu Y, Zhang Q (2022) Cross-image relational knowledge distillation for semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition. pp 12319\u201312328","DOI":"10.1109\/CVPR52688.2022.01200"},{"key":"11646_CR21","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2022.3205686","author":"Y Li","year":"2022","unstructured":"Li Y, Gong Y, Zhang Z (2022) Few-shot object detection based on self-knowledge distillation. IEEE Intell Syst. https:\/\/doi.org\/10.1109\/MIS.2022.3205686","journal-title":"IEEE Intell Syst"},{"key":"11646_CR22","doi-asserted-by":"crossref","unstructured":"Zhang Y, Xiang T, Hospedales TM, Lu H (2018) Deep mutual learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 4320\u20134328","DOI":"10.1109\/CVPR.2018.00454"},{"key":"11646_CR23","first-page":"5191","volume":"34","author":"SI Mirzadeh","year":"2020","unstructured":"Mirzadeh SI, Farajtabar M, Li A, Levine N, Matsukawa A, Ghasemzadeh H (2020) Improved knowledge distillation via teacher assistant. 
Proc AAAI Conf Artif Intell 34:5191\u20135198","journal-title":"Proc AAAI Conf Artif Intell"},{"issue":"9","key":"11646_CR24","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1007\/s00158-022-03350-6","volume":"65","author":"NA Khan","year":"2022","unstructured":"Khan NA, Sulaiman M, Alshammari FS (2022) Heat transfer analysis of an inclined longitudinal porous fin of trapezoidal, rectangular and dovetail profiles using cascade neural networks. Struct Multidiscipl Optim 65(9):251","journal-title":"Struct Multidiscipl Optim"},{"key":"11646_CR25","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2023.108740","volume":"109","author":"NA Khan","year":"2023","unstructured":"Khan NA, Laouini G, Alshammari FS, Khalid M, Aamir N (2023) Supervised machine learning for jamming transition in traffic flow with fluctuations in acceleration and braking. Comput Electr Eng 109:108740","journal-title":"Comput Electr Eng"},{"issue":"5","key":"11646_CR26","doi-asserted-by":"publisher","first-page":"1173","DOI":"10.3390\/math11051173","volume":"11","author":"M Sulaiman","year":"2023","unstructured":"Sulaiman M, Khan NA, Alshammari FS, Laouini G (2023) Performance of heat transfer in micropolar fluid with isothermal and isoflux boundary conditions using supervised neural networks. 
Mathematics 11(5):1173","journal-title":"Mathematics"}],"container-title":["Neural Processing Letters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-024-11646-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11063-024-11646-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-024-11646-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,15]],"date-time":"2024-07-15T11:29:23Z","timestamp":1721042963000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11063-024-11646-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,6]]},"references-count":26,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["11646"],"URL":"https:\/\/doi.org\/10.1007\/s11063-024-11646-5","relation":{},"ISSN":["1573-773X"],"issn-type":[{"type":"electronic","value":"1573-773X"}],"subject":[],"published":{"date-parts":[[2024,6,6]]},"assertion":[{"value":"8 May 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"197"}}