{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T18:15:34Z","timestamp":1771611334475,"version":"3.50.1"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1-2","license":[{"start":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T00:00:00Z","timestamp":1708560000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T00:00:00Z","timestamp":1708560000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["01IS19070"],"award-info":[{"award-number":["01IS19070"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001652","name":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001652","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Parallel Prog"],"published-print":{"date-parts":[[2024,4]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Filter pruning of convolutional neural networks (CNNs) is a common technique to effectively reduce the memory footprint, the number of arithmetic operations, and, consequently, inference time. Recent pruning approaches also consider the targeted device (i.e., graphics processing units) for CNN deployment to reduce the actual inference time. However, simple metrics, such as the <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\ell ^1$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:msup>\n                    <mml:mi>\u2113<\/mml:mi>\n                    <mml:mn>1<\/mml:mn>\n                  <\/mml:msup>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>-norm, are used for deciding which filters to prune. In this work, we propose a hardware-aware technique to explore the vast multi-objective design space of possible filter pruning configurations. Our approach incorporates not only the targeted device but also techniques from explainable artificial intelligence for ranking and deciding which filters to prune. For each layer, the number of filters to be pruned is optimized with the objective of minimizing the inference time and the error rate of the CNN. Experimental results show that our approach can speed up inference time by 1.40\u00d7 and 1.30\u00d7 for VGG-16 on the CIFAR-10 dataset and ResNet-18 on the ILSVRC-2012 dataset, respectively, compared to the state-of-the-art ABCPruner.<\/jats:p>","DOI":"10.1007\/s10766-024-00760-5","type":"journal-article","created":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T16:02:33Z","timestamp":1708617753000},"page":"40-58","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Hardware-Aware Evolutionary Explainable Filter Pruning for Convolutional Neural Networks"],"prefix":"10.1007","volume":"52","author":[{"given":"Christian","family":"Heidorn","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2066-646X","authenticated-orcid":false,"given":"Muhammad","family":"Sabih","sequence":"additional","affiliation":[]},{"given":"Nicolai","family":"Meyerh\u00f6fer","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9830-2311","authenticated-orcid":false,"given":"Christian","family":"Schinabeck","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6285-5862","authenticated-orcid":false,"given":"J\u00fcrgen","family":"Teich","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3663-6484","authenticated-orcid":false,"given":"Frank","family":"Hannig","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,2,22]]},"reference":[{"key":"760_CR1","first-page":"241:1","volume":"22","author":"T Hoefler","year":"2021","unstructured":"Hoefler, T., et al.: Sparsity in deep learning: pruning and growth for efficient inference and training in neural networks. J. Mach. Learn. Res. 22, 241:1-241:124 (2021)","journal-title":"J. Mach. Learn. Res."},{"key":"760_CR2","doi-asserted-by":"publisher","unstructured":"Zhang, Y., et al.: Improvement of efficiency in evolutionary pruning . In: Proceedings of the International Joint Conference on Neural Networks (IJCNN), pp. 1\u20138. IEEE (2021). https:\/\/doi.org\/10.1109\/IJCNN52387.2021.9534055","DOI":"10.1109\/IJCNN52387.2021.9534055"},{"key":"760_CR3","doi-asserted-by":"publisher","unstructured":"Lin, M., et al.: Channel pruning via automatic structure search . In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI), pp. 673\u2013679 (2020). https:\/\/doi.org\/10.24963\/ijcai.2020\/94","DOI":"10.24963\/ijcai.2020\/94"},{"issue":"3","key":"760_CR4","doi-asserted-by":"publisher","first-page":"1626","DOI":"10.1109\/TCYB.2019.2928174","volume":"51","author":"Y Zhou","year":"2021","unstructured":"Zhou, Y., Yen, G.G., Yi, Z.: A knee-guided evolutionary algorithm for compressing deep neural networks. IEEE Trans. Cybern. 51(3), 1626\u20131638 (2021). https:\/\/doi.org\/10.1109\/TCYB.2019.2928174","journal-title":"IEEE Trans. Cybern."},{"key":"760_CR5","doi-asserted-by":"publisher","unstructured":"Heidorn, C., et al.: Hardware-aware evolutionary filter pruning . In: Embedded Computer Systems: Architectures, Modeling, and Simulation\u201422nd International Conference, SAMOS 2022, Samos, Greece, July 3\u20137, 2022, Proceedings, vol. 13511. Lecture Notes in Computer Science, pp. 283\u2013299. Springer, (2022). https:\/\/doi.org\/10.1007\/978-3-031-15074-6_18","DOI":"10.1007\/978-3-031-15074-6_18"},{"issue":"10","key":"760_CR6","doi-asserted-by":"publisher","first-page":"2525","DOI":"10.1109\/TPAMI.2018.2858232","volume":"41","author":"J-H Luo","year":"2019","unstructured":"Luo, J.-H., et al.: ThiNet: pruning CNN filters for a thinner net. IEEE Trans. Pattern Anal. Mach. Intell. 41(10), 2525\u20132538 (2019). https:\/\/doi.org\/10.1109\/TPAMI.2018.2858232","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"issue":"2","key":"760_CR7","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1109\/4235.996017","volume":"6","author":"K Deb","year":"2002","unstructured":"Deb, K., et al.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evolut. Comput. 6(2), 182\u2013197 (2002). https:\/\/doi.org\/10.1109\/4235.996017","journal-title":"IEEE Trans. Evolut. Comput."},{"key":"760_CR8","volume-title":"Learning Multiple Layers of Features from Tiny Images","author":"A Krizhevsky","year":"2012","unstructured":"Krizhevsky, A.: Learning Multiple Layers of Features from Tiny Images. University of Toronto (2012)"},{"key":"760_CR9","unstructured":"Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR) (2015)"},{"key":"760_CR10","unstructured":"Ma, X., et al.: Non-structured DNN weight pruning\u2014Is it beneficial in any platform? In: The Computing Research Repository (CoRR) (2020). arXiv: 1907.02124 [cs.LG]"},{"key":"760_CR11","unstructured":"Li, H., et al.: Pruning filters for efficient ConvNets . In: Proceedings of the 5th International Conference on Learning Representations (ICLR) (2017)"},{"key":"760_CR12","unstructured":"Molchanov, P., et al.: Pruning convolutional neural networks for resource efficient inference. In: Proceedings of the 5th International Conference on Learning Representations (ICLR) (2017)"},{"key":"760_CR13","unstructured":"Hu, H., et al.: Network trimming: a data-driven neuron pruning approach towards efficient deep architectures . In: The Computing Research Repository (CoRR) (2016). arXiv: 1607.03250 [cs.NE]"},{"key":"760_CR14","unstructured":"Shrikumar, A., Greenside, P, Kundaje, A.: Learning important features through propagating activation differences . In: Proceedings of the 34th International Conference on Machine Learning (ICML), vol. 70, pp. 3145\u20133153 (2017)"},{"key":"760_CR15","unstructured":"Sabih, M., Hannig, F., Teich, J.: Utilizing explainable AI for quantization and pruning of deep neural networks . In: The Computing Research Repository (CoRR) (2020). arXiv: 2008.09072 [cs.CV]"},{"key":"760_CR16","doi-asserted-by":"publisher","unstructured":"Sabih, M., Hannig, F., Teich, J.: DyFiP: explainable AI-based dynamic filter pruning of convolutional neural networks . In: Proceedings of the 2nd European Workshop on Machine Learning and Systems (EuroMLSys), pp. 109\u2013115. ACM (2022). https:\/\/doi.org\/10.1145\/3517207.3526982","DOI":"10.1145\/3517207.3526982"},{"key":"760_CR31","doi-asserted-by":"publisher","unstructured":"Muhammad Sabih et al.: MOSP: Multi-objective sensitivity pruning of deep neural networks. In: The 13th International Green and Sustainable Computing Conference (IGSC), pp 1\u20138. IEEE (2022). https:\/\/doi.org\/10.1109\/IGSC55832.2022.9969374","DOI":"10.1109\/IGSC55832.2022.9969374"},{"key":"760_CR17","doi-asserted-by":"publisher","unstructured":"Radu, V., et al.: Performance aware convolutional neural network channel pruning for embedded GPUs . In: Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), pp. 24\u201334. IEEE (2019). https:\/\/doi.org\/10.1109\/IISWC47752.2019.9042000","DOI":"10.1109\/IISWC47752.2019.9042000"},{"key":"760_CR18","doi-asserted-by":"publisher","unstructured":"Li, X., et al.: Partial order pruning: for best speed\/accuracy trade-off in neural architecture search . In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9145\u20139153. IEEE (2019). https:\/\/doi.org\/10.1109\/CVPR.2019.00936","DOI":"10.1109\/CVPR.2019.00936"},{"key":"760_CR19","unstructured":"Shen, M., et al.: HALP: hardware-aware latency pruning . In: The Computing Research Repository (CoRR) (2021). arXiv: 2110.10811 [cs.CV]"},{"key":"760_CR20","unstructured":"Lin, L., Yang, Y., Guo, Z.: AACP: model compression by accurate and automatic channel pruning . In: The Computing Research Repository (CoRR) (2021). arXiv: 2102.00390 [cs.CV]"},{"key":"760_CR21","unstructured":"Han, S., et al.: Learning both weights and connections for efficient neural network . In: Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 1135\u20131143 (2015)"},{"key":"760_CR22","doi-asserted-by":"publisher","unstructured":"Schuster, A., et al.: Design space exploration of time, energy, and error rate trade-offs for CNNs using accuracy-programmable instruction set processors . In: Proceedings of the 2nd International Work- shop on IoT, Edge, and Mobile for Embedded Machine Learning (ITEM), pp. 375\u2013389. Springer (2021). https:\/\/doi.org\/10.1007\/978-3-030-93736-2_29","DOI":"10.1007\/978-3-030-93736-2_29"},{"key":"760_CR23","unstructured":"Corp, N.: Nvidia TensorRT documentation. (2021). https:\/\/docs.nvidia.com\/deeplearning\/tensorrt\/developer-guide\/index.html. visited on 10 Apr 2021"},{"key":"760_CR24","doi-asserted-by":"publisher","unstructured":"He, K., et al.: Deep residual learning for image recognition . In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770\u2013778. IEEE (2016). https:\/\/doi.org\/10.1109\/CVPR.2016.90","DOI":"10.1109\/CVPR.2016.90"},{"key":"760_CR25","unstructured":"Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. In: The Computing Research Repository (CoRR) (2014). arXiv: 1409.0575 [cs.CV]"},{"key":"760_CR26","doi-asserted-by":"publisher","unstructured":"Lukasiewycz, M., et al.: Opt4J\u2014a modular framework for meta-heuristic optimization . In: Proceedings of the Genetic and Evolutionary Computing Conference (GECCO), pp. 1723\u20131730. ACM (2011). https:\/\/doi.org\/10.1145\/2001576.2001808","DOI":"10.1145\/2001576.2001808"},{"issue":"4","key":"760_CR27","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1109\/4235.797969","volume":"3","author":"E Zitzler","year":"1999","unstructured":"Zitzler, E., Thiele, L.: Multiobjective evolutionary algorithms: a comparative case study and the strength pareto approach. IEEE Trans. Evolut. Comput. 3(4), 257\u2013271 (1999). https:\/\/doi.org\/10.1109\/4235.797969","journal-title":"IEEE Trans. Evolut. Comput."},{"key":"760_CR28","doi-asserted-by":"publisher","first-page":"110074","DOI":"10.1109\/ACCESS.2021.3101936","volume":"9","author":"M Lechner","year":"2021","unstructured":"Lechner, M., Jantsch, A.: Blackthorn: latency estimation framework for CNNs on embedded Nvidia platforms. IEEE Access 9, 110074\u2013110084 (2021). https:\/\/doi.org\/10.1109\/ACCESS.2021.3101936","journal-title":"IEEE Access"},{"key":"760_CR29","doi-asserted-by":"publisher","unstructured":"Plagwitz, P., et al.: A Safari through FPGA-based neural network compilation and design automation flows. In: Proceedings of the 29th IEEE International Symposium on Field-Programmable Custom Com- puting Machines (FCCM), pp. 10\u201319. IEEE (2021). https:\/\/doi.org\/10.1109\/FCCM51124.2021.00010","DOI":"10.1109\/FCCM51124.2021.00010"},{"key":"760_CR30","doi-asserted-by":"publisher","unstructured":"Heidorn, C., Hannig, F., Teich, J.: Design space exploration for layer-parallel execution of convolutional neural networks on CGRAs . In: Proceedings of the 23rd International Workshop on Software and Compilers for Embedded Systems (SCOPES), pp. 26\u201331. ACM (2020). https:\/\/doi.org\/10.1145\/3378678.3391878","DOI":"10.1145\/3378678.3391878"}],"container-title":["International Journal of Parallel Programming"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10766-024-00760-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10766-024-00760-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10766-024-00760-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,29]],"date-time":"2024-03-29T13:05:58Z","timestamp":1711717558000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10766-024-00760-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,22]]},"references-count":31,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2024,4]]}},"alternative-id":["760"],"URL":"https:\/\/doi.org\/10.1007\/s10766-024-00760-5","relation":{},"ISSN":["0885-7458","1573-7640"],"issn-type":[{"value":"0885-7458","type":"print"},{"value":"1573-7640","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,22]]},"assertion":[{"value":"28 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 January 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors report no conflicts of interest. The funders had no role in the design of the study; collection, analysis, or interpretation of the data; in writing the manuscript, or in the decision to publish the results.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}