{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,29]],"date-time":"2025-11-29T05:07:10Z","timestamp":1764392830010,"version":"3.46.0"},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T00:00:00Z","timestamp":1760313600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T00:00:00Z","timestamp":1760313600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Process Lett"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Graph substitution is a key optimization technique used in deep learning frameworks. Traditional search-based methods are one way to address the problem of graph substitution. However, with the ongoing expansion of deep neural networks (DNNs), the exploration of their vast equivalent graph search space becomes increasingly time-consuming. In this paper, we propose two heuristic methods to accelerate the search process in graph substitution, offering a relatively novel direction compared to existing methods. The first method employs a Memory-Augmented heuristic to optimize computation graphs. To further enhance the efficiency of computation graph optimization, the second method uses the simulated annealing method. This method adds computation graphs with degraded performance into the candidate set with a certain probability. The experimental results show that without significant compromise of inference performance, these two methods can find graph substitutions delivering similar DNN computing performance compared to existing searching methods, while the overall searching time can be reduced from hours to seconds. The source code is available at\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/hudevictor\/MAS-SAS\" ext-link-type=\"uri\">https:\/\/github.com\/hudevictor\/MAS-SAS<\/jats:ext-link>\n                    .\n                  <\/jats:p>","DOI":"10.1007\/s11063-025-11806-1","type":"journal-article","created":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T04:11:37Z","timestamp":1760328697000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Accelerating graph substitutions in DNN optimization by heuristic algorithms"],"prefix":"10.1007","volume":"57","author":[{"given":"Chun","family":"Hu","sequence":"first","affiliation":[]},{"given":"Yuxin","family":"He","sequence":"additional","affiliation":[]},{"given":"Yufan","family":"Huang","sequence":"additional","affiliation":[]},{"given":"Junhui","family":"He","sequence":"additional","affiliation":[]},{"given":"Mengting","family":"Yuan","sequence":"additional","affiliation":[]},{"given":"Qingan","family":"Li","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,10,13]]},"reference":[{"issue":"21","key":"11806_CR1","doi-asserted-by":"publisher","first-page":"25787","DOI":"10.1007\/s10489-023-04682-6","volume":"53","author":"SP Ang","year":"2023","unstructured":"Ang SP, Phung SL, Duong ST, Bouzerdoum A (2023) Msd-nas: multi-scale dense neural architecture search for real-time pedestrian lane detection. Appl Intell 53(21):25787\u201325801","journal-title":"Appl Intell"},{"issue":"24","key":"11806_CR2","doi-asserted-by":"publisher","first-page":"30482","DOI":"10.1007\/s10489-023-05162-7","volume":"53","author":"N Thierry","year":"2023","unstructured":"Thierry N, Bao B-K, Ali Z, Tan Z, Christ Chatelain IB, Kefalas P (2023) Prm-kged: paper recommender model using knowledge graph embedding and deep neural network. Appl Intell 53(24):30482\u201330496","journal-title":"Appl Intell"},{"key":"11806_CR3","first-page":"255","volume":"3","author":"Y Yang","year":"2021","unstructured":"Yang Y, Phothilimthana P, Wang Y, Willsey M, Roy S, Pienaar J (2021) Equality saturation for tensor graph superoptimization. Proc Mach Learn Syst 3:255\u2013268","journal-title":"Proc Mach Learn Syst"},{"key":"11806_CR4","doi-asserted-by":"crossref","unstructured":"Li J, Peng M, Li Q, Peng M, Yuan M (2022) Glite: A fast and efficient automatic graph-level optimizer for large-scale dnns. In: Proceedings of the 59th ACM\/IEEE Design Automation Conference, pp. 199\u2013204","DOI":"10.1145\/3489517.3530418"},{"key":"11806_CR5","doi-asserted-by":"publisher","first-page":"900","DOI":"10.1016\/j.ins.2022.12.033","volume":"623","author":"W Liang","year":"2023","unstructured":"Liang W, Dong W, Yuan M (2023) Slf: a passive parallelization of subgraph isomorphism. Inf Sci 623:900\u2013914","journal-title":"Inf Sci"},{"key":"11806_CR6","unstructured":"Chen T, Moreau T, Jiang Z, Zheng L, Yan E, Shen H, Cowan M, Wang L, Hu Y, Ceze L (2018) $$\\{$$TVM$$\\}$$: An automated $$\\{$$End-to-End$$\\}$$ optimizing compiler for deep learning. In: 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pp. 578\u2013594"},{"key":"11806_CR7","first-page":"27","volume":"1","author":"Z Jia","year":"2019","unstructured":"Jia Z, Thomas J, Warszawski T, Gao M, Zaharia M, Aiken A (2019) Optimizing dnn computation with relaxed graph substitutions. Proc Mach Learn Syst 1:27\u201339","journal-title":"Proc Mach Learn Syst"},{"key":"11806_CR8","doi-asserted-by":"crossref","unstructured":"Jia Z, Padon O, Thomas J, Warszawski T, Zaharia M, Aiken A (2019) Taso: optimizing deep learning computation with automatic generation of graph substitutions. In: Proceedings of the 27th ACM Symposium on Operating Systems Principles, pp. 47\u201362","DOI":"10.1145\/3341301.3359630"},{"key":"11806_CR9","unstructured":"Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M (2016) $$\\{$$TensorFlow$$\\}$$: a system for $$\\{$$Large-Scale$$\\}$$ machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265\u2013283"},{"key":"11806_CR10","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, et\u00a0al (2019) Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32"},{"issue":"12","key":"11806_CR11","doi-asserted-by":"publisher","first-page":"2734","DOI":"10.14778\/3407790.3407857","volume":"13","author":"J Fang","year":"2020","unstructured":"Fang J, Shen Y, Wang Y, Chen L (2020) Optimizing dnn computation graph using graph substitutions. Proc VLDB Endow 13(12):2734\u20132746","journal-title":"Proc VLDB Endow"},{"issue":"2","key":"11806_CR12","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1109\/TPDS.2022.3223068","volume":"34","author":"S Zhang","year":"2022","unstructured":"Zhang S, Jiang Z, Hou X, Li M, Yuan M, You H (2022) Drone: an efficient distributed subgraph-centric framework for processing large-scale power-law graphs. IEEE Trans Parallel Distrib Syst 34(2):463\u2013474","journal-title":"IEEE Trans Parallel Distrib Syst"},{"issue":"5","key":"11806_CR13","doi-asserted-by":"publisher","first-page":"733","DOI":"10.3390\/e25050733","volume":"25","author":"V Laveglia","year":"2023","unstructured":"Laveglia V, Trentin E (2023) Downward-growing neural networks. Entropy 25(5):733","journal-title":"Entropy"},{"issue":"7","key":"11806_CR14","doi-asserted-by":"publisher","first-page":"7546","DOI":"10.1007\/s10489-022-03872-y","volume":"53","author":"E Tagne Fute","year":"2023","unstructured":"Tagne Fute E, Nyabeye Pangop D-K, Tonye E (2023) A new hybrid localization approach in wireless sensor networks based on particle swarm optimization and tabu search. Appl Intell 53(7):7546\u20137561","journal-title":"Appl Intell"},{"issue":"2","key":"11806_CR15","doi-asserted-by":"publisher","first-page":"2092","DOI":"10.1007\/s10489-021-02369-4","volume":"52","author":"VK Chennuru","year":"2022","unstructured":"Chennuru VK, Timmappareddy SR (2022) Simulated annealing based undersampling (saus): a hybrid multi-objective optimization method to tackle class imbalance. Appl Intell 52(2):2092\u20132110","journal-title":"Appl Intell"},{"issue":"1","key":"11806_CR16","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1007\/s00453-023-01135-x","volume":"86","author":"B Doerr","year":"2024","unstructured":"Doerr B, Rajabi A, Witt C (2024) Simulated annealing is a polynomial-time approximation scheme for the minimum spanning tree problem. Algorithmica 86(1):64\u201389","journal-title":"Algorithmica"},{"key":"11806_CR17","doi-asserted-by":"crossref","unstructured":"Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818\u20132826","DOI":"10.1109\/CVPR.2016.308"},{"key":"11806_CR18","doi-asserted-by":"crossref","unstructured":"Zoph B, Vasudevan V, Shlens J, Le QV (2018) Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8697\u20138710","DOI":"10.1109\/CVPR.2018.00907"},{"key":"11806_CR19","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"11806_CR20","doi-asserted-by":"crossref","unstructured":"Xie S, Girshick R, Doll\u00e1r P, Tu Z, He K (2017) Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1492\u20131500","DOI":"10.1109\/CVPR.2017.634"},{"key":"11806_CR21","doi-asserted-by":"crossref","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2019) Bert: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, (long and Short Papers), 1:4171\u20134186","DOI":"10.18653\/v1\/N19-1423"},{"key":"11806_CR22","unstructured":"Zoph B, Le QV (2016) Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578"},{"key":"11806_CR23","unstructured":"Pham H, Guan M, Zoph B, Le Q, Dean J (2018) Efficient neural architecture search via parameters sharing. In: International Conference on Machine Learning, 4095\u20134104 . PMLR"}],"container-title":["Neural Processing Letters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-025-11806-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11063-025-11806-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-025-11806-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,29]],"date-time":"2025-11-29T05:02:33Z","timestamp":1764392553000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11063-025-11806-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,13]]},"references-count":23,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["11806"],"URL":"https:\/\/doi.org\/10.1007\/s11063-025-11806-1","relation":{},"ISSN":["1573-773X"],"issn-type":[{"type":"electronic","value":"1573-773X"}],"subject":[],"published":{"date-parts":[[2025,10,13]]},"assertion":[{"value":"27 August 2025","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 October 2025","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing Interests"}},{"value":"The data used are all publicly available and have no ethical violations.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical and Informed Consent for Data Used"}}],"article-number":"86"}}