{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,30]],"date-time":"2026-06-30T20:52:21Z","timestamp":1782852741592,"version":"3.54.5"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"17-18","license":[{"start":{"date-parts":[[2024,6,19]],"date-time":"2024-06-19T00:00:00Z","timestamp":1718755200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,19]],"date-time":"2024-06-19T00:00:00Z","timestamp":1718755200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National research foundation of korea","doi-asserted-by":"crossref","award":["NRF-2022R1A4A5034121"],"award-info":[{"award-number":["NRF-2022R1A4A5034121"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100014188","name":"MSIT","doi-asserted-by":"crossref","award":["IITP-2024-RS-2023-00260175"],"award-info":[{"award-number":["IITP-2024-RS-2023-00260175"]}],"id":[{"id":"10.13039\/501100014188","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Appl Intell"],"published-print":{"date-parts":[[2024,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Several studies have attempted to solve traveling salesman problems (TSPs) using various deep learning techniques. Among them, Transformer-based models show state-of-the-art performance even for large-scale Traveling Salesman Problems (TSPs). However, they are based on fully-connected attention models and suffer from large computational complexity and GPU memory usage. Our work is the first CNN-Transformer model based on a CNN embedding layer and partial self-attention for TSP. Our CNN-Transformer model is able to better learn spatial features from input data using a CNN embedding layer compared with the standard Transformer-based models. It also removes considerable redundancy in fully-connected attention models using the proposed partial self-attention. Experimental results show that the proposed CNN embedding layer and partial self-attention are very effective in improving performance and computational complexity. The proposed model exhibits the best performance in real-world datasets and outperforms other existing state-of-the-art (SOTA) Transformer-based models in various aspects. Our code is publicly available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/cm8908\/CNN_Transformer3\">https:\/\/github.com\/cm8908\/CNN_Transformer3<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s10489-024-05603-x","type":"journal-article","created":{"date-parts":[[2024,6,19]],"date-time":"2024-06-19T07:03:42Z","timestamp":1718780622000},"page":"7982-7993","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["A lightweight CNN-transformer model for learning traveling salesman problems"],"prefix":"10.1007","volume":"54","author":[{"given":"Minseop","family":"Jung","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jaeseung","family":"Lee","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7172-5039","authenticated-orcid":false,"given":"Jibum","family":"Kim","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2024,6,19]]},"reference":[{"issue":"3","key":"5603_CR1","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1016\/0304-3975(77)90012-3","volume":"4","author":"CH Papadimitriou","year":"1977","unstructured":"Papadimitriou CH (1977) The euclidean travelling salesman problem is np-complete. Theoretical Comput Sci 4(3):237\u2013244","journal-title":"Theoretical Comput Sci"},{"key":"5603_CR2","unstructured":"Christofides N (1976) Worst-case analysis of a new heuristic for the travelling salesman problem. Technical Report"},{"key":"5603_CR3","unstructured":"Perron L, Furnon V (2022) Or-tools. https:\/\/developers.google.com\/optimization\/, 2022-11-25"},{"key":"5603_CR4","unstructured":"Kool W, Van\u00a0Hoof H, Welling M (2018) Attention, learn to solve routing problems! arXiv:1803.08475"},{"key":"5603_CR5","unstructured":"Bresson X, Laurent T (2021) The transformer network for the traveling salesman problem. arXiv:2103.03012"},{"key":"5603_CR6","unstructured":"Applegate D, Bixby R, Chv\u00e1tal V et\u00a0al (2006) Concorde tsp solver. https:\/\/www.math.uwaterloo.ca\/tsp\/concorde\/"},{"key":"5603_CR7","unstructured":"Vinyals O, Fortunato M, Jaitly N (2015) Pointer networks. Adv Neural Inf Process Syst 28"},{"key":"5603_CR8","unstructured":"Gurobi Optimization, LLC (2023) Gurobi Optimizer Reference Manual. https:\/\/www.gurobi.com"},{"key":"5603_CR9","unstructured":"Bello I, Pham H, Le QV et\u00a0al (2016) Neural combinatorial optimization with reinforcement learning. arXiv:1611.09940"},{"key":"5603_CR10","unstructured":"Nazari M, Oroojlooy A, Snyder L et\u00a0al (2018) Reinforcement learning for solving the vehicle routing problem. Adv Neural Inf Process Syst 31"},{"key":"5603_CR11","unstructured":"Joshi CK, Laurent T, Bresson X (2019) An efficient graph convolutional network technique for the travelling salesman problem. arXiv:1906.01227"},{"issue":"12","key":"5603_CR12","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0260995","volume":"16","author":"A Stohy","year":"2021","unstructured":"Stohy A, Abdelhakam HT, Ali S et al (2021) Hybrid pointer networks for traveling salesman problems optimization. Plos one 16(12):e0260995","journal-title":"Plos one"},{"key":"5603_CR13","unstructured":"Ma Q, Ge S, He D et\u00a0al (2019) Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning. arXiv:1911.04936"},{"key":"5603_CR14","doi-asserted-by":"crossref","unstructured":"Miki S, Ebara H (2019) Solving traveling salesman problem with image-based classification. In: 2019 IEEE 31st International conference on tools with artificial intelligence (ICTAI). IEEE, pp 1118\u20131123","DOI":"10.1109\/ICTAI.2019.00156"},{"issue":"12","key":"5603_CR15","doi-asserted-by":"publisher","first-page":"7475","DOI":"10.1109\/TSMC.2020.2969317","volume":"51","author":"Z Ling","year":"2020","unstructured":"Ling Z, Tao X, Zhang Y et al (2020) Solving optimization problems through fully convolutional networks: An application to the traveling salesman problem. IEEE Trans Syst, Man, Cybernetics: Syst 51(12):7475\u20137485","journal-title":"IEEE Trans Syst, Man, Cybernetics: Syst"},{"key":"5603_CR16","doi-asserted-by":"crossref","unstructured":"Sultana N, Chan J, Sarwar T et\u00a0al (2022) Learning to optimise general tsp instances. Int J Machine Learn Cybernetics pp 1\u201316","DOI":"10.1007\/s13042-022-01516-8"},{"key":"5603_CR17","unstructured":"Vaswani A, Shazeer N, Parmar N et\u00a0al (2017) Attention is all you need. Adv Neural Inf Process Syst 30"},{"key":"5603_CR18","doi-asserted-by":"crossref","unstructured":"Dai Z, Yang Z, Yang Y et\u00a0al (2019) Transformer-xl: Attentive language models beyond a fixed-length context. arXiv:1901.02860","DOI":"10.18653\/v1\/P19-1285"},{"key":"5603_CR19","unstructured":"Devlin J, Chang MW, Lee K et\u00a0al (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805"},{"key":"5603_CR20","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A et\u00a0al (2020) An image is worth 16x16 words: Transformers Image Recognition at Scale. arXiv:2010.11929"},{"key":"5603_CR21","doi-asserted-by":"crossref","unstructured":"Deudon M, Cournut P, Lacoste A et\u00a0al (2018) Learning heuristics for the tsp by policy gradient. In: International conference on the integration of constraint programming, artificial intelligence, and operations research. Springer, pp 170\u2013181","DOI":"10.1007\/978-3-319-93031-2_12"},{"key":"5603_CR22","doi-asserted-by":"crossref","unstructured":"Wu Y, Song W, Cao Z et\u00a0al (2021) Learning improvement heuristics for solving routing problems. IEEE Trans Neural Netw Learn Syst","DOI":"10.1109\/TNNLS.2021.3068828"},{"key":"5603_CR23","first-page":"21188","volume":"33","author":"YD Kwon","year":"2020","unstructured":"Kwon YD, Choo J, Kim B et al (2020) Pomo: Policy optimization with multiple optima for reinforcement learning. Adv Neural Inf Process Syst 33:21188\u201321198","journal-title":"Adv Neural Inf Process Syst"},{"key":"5603_CR24","unstructured":"Goh YL, Lee WS, Bresson X et\u00a0al (2022) Combining reinforcement learning and optimal transport for the traveling salesman problem. arXiv:2203.00903"},{"key":"5603_CR25","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1016\/j.neunet.2023.02.014","volume":"161","author":"H Yang","year":"2023","unstructured":"Yang H, Zhao M, Yuan L et al (2023) Memory-efficient transformer-based network model for traveling salesman problem. Neural Netw 161:589\u2013597","journal-title":"Neural Netw"},{"key":"5603_CR26","doi-asserted-by":"crossref","unstructured":"Guo Q, Qiu X, Liu P et\u00a0al (2019) Star-transformer. arXiv:1902.09113","DOI":"10.18653\/v1\/N19-1133"},{"key":"5603_CR27","unstructured":"Beltagy I, Peters ME, Cohan A (2020) Longformer: The long-document transformer. arXiv:2004.05150"},{"key":"5603_CR28","unstructured":"Wang S, Li BZ, Khabsa M et\u00a0al (2020) Linformer: Self-attention with linear complexity. arXiv:2006.04768"},{"key":"5603_CR29","doi-asserted-by":"crossref","unstructured":"Zhou H, Zhang S, Peng J et\u00a0al (2021) Informer: Beyond efficient transformer for long sequence time-series forecasting. In: Proceedings of the AAAI conference on artificial intelligence, pp 11106\u201311115","DOI":"10.1609\/aaai.v35i12.17325"},{"key":"5603_CR30","doi-asserted-by":"crossref","unstructured":"Pan X, Jin Y, Ding Y et\u00a0al (2023) H-tsp: Hierarchically solving the large-scale traveling salesman problem. In: AAAI 2023, https:\/\/www.microsoft.com\/en-us\/research\/publication\/h-tsp-hierarchically-solving-the-large-scale-traveling-salesman-problem\/","DOI":"10.1609\/aaai.v37i8.26120"},{"key":"5603_CR31","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.4643703","author":"H Ren","year":"2023","unstructured":"Ren H, Zhong R, Gui H (2023) A self-comparison based reinforcement learning method for dynamic traveling salesman problem. Available at SSRN. https:\/\/doi.org\/10.2139\/ssrn.4643703","journal-title":"Available at SSRN"},{"key":"5603_CR32","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S et\u00a0al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"5603_CR33","unstructured":"Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning. PMLR, pp 448\u2013456"},{"issue":"4","key":"5603_CR34","doi-asserted-by":"publisher","first-page":"376","DOI":"10.1287\/ijoc.3.4.376","volume":"3","author":"G Reinelt","year":"1991","unstructured":"Reinelt G (1991) Tsplib-a traveling salesman problem library. ORSA J Comput 3(4):376\u2013384","journal-title":"ORSA J Comput"},{"issue":"1\u20132","key":"5603_CR35","doi-asserted-by":"publisher","first-page":"349","DOI":"10.1016\/S0004-3702(96)00030-6","volume":"88","author":"IP Gent","year":"1996","unstructured":"Gent IP, Walsh T (1996) The tsp phase transition. Artif Intell 88(1\u20132):349\u2013358","journal-title":"Artif Intell"},{"key":"5603_CR36","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1016\/j.asoc.2018.07.010","volume":"71","author":"M C\u00e1rdenas-Montes","year":"2018","unstructured":"C\u00e1rdenas-Montes M (2018) Creating hard-to-solve instances of travelling salesman problem. Appl Soft Comput 71:268\u2013276","journal-title":"Appl Soft Comput"},{"key":"5603_CR37","unstructured":"Lowerre BT (1976) The harpy speech recognition system[ph. d. thesis]. Carnegie-Mellon University"},{"key":"5603_CR38","doi-asserted-by":"crossref","unstructured":"Croes GA (1958) A method for solving traveling-salesman problems. Operations Res 6(6):791\u2013812. http:\/\/www.jstor.org\/stable\/167074","DOI":"10.1287\/opre.6.6.791"},{"key":"5603_CR39","doi-asserted-by":"publisher","first-page":"282","DOI":"10.1007\/11871842_29","volume-title":"Machine Learning: ECML 2006","author":"L Kocsis","year":"2006","unstructured":"Kocsis L, Szepesv\u00e1ri C (2006) Bandit based monte-carlo planning. Machine Learning: ECML 2006. Springer, Berlin Heidelberg, Berlin, Heidelberg, pp 282\u2013293"},{"key":"5603_CR40","doi-asserted-by":"publisher","first-page":"108418","DOI":"10.1109\/ACCESS.2020.3000236","volume":"8","author":"Z Xing","year":"2020","unstructured":"Xing Z, Tu S (2020) A graph neural network assisted monte carlo tree search approach to traveling salesman problem. IEEE Access 8:108418\u2013108428","journal-title":"IEEE Access"},{"key":"5603_CR41","unstructured":"Xing Z, Tu S, Xu L (2020) Solve traveling salesman problem by monte carlo tree search and deep neural network. arXiv:2005.06879"},{"key":"5603_CR42","unstructured":"Mehta S, Ghazvininejad M, Iyer S et\u00a0al (2020) Delight: Deep and light-weight transformer. arXiv:2008.00623"},{"key":"5603_CR43","unstructured":"Howard AG, Zhu M, Chen B et\u00a0al (2017) Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861"},{"key":"5603_CR44","doi-asserted-by":"crossref","unstructured":"Sandler M, Howard A, Zhu M et\u00a0al (2018) Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4510\u20134520","DOI":"10.1109\/CVPR.2018.00474"},{"key":"5603_CR45","doi-asserted-by":"crossref","unstructured":"Zhang X, Zhou X, Lin M et\u00a0al (2018) Shufflenet: An extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6848\u20136856","DOI":"10.1109\/CVPR.2018.00716"},{"key":"5603_CR46","doi-asserted-by":"crossref","unstructured":"Ma N, Zhang X, Zheng HT et\u00a0al (2018) Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European conference on computer vision (ECCV), pp 116\u2013131","DOI":"10.1007\/978-3-030-01264-9_8"}],"container-title":["Applied Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-024-05603-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10489-024-05603-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-024-05603-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T12:25:54Z","timestamp":1723033554000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10489-024-05603-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,19]]},"references-count":46,"journal-issue":{"issue":"17-18","published-print":{"date-parts":[[2024,9]]}},"alternative-id":["5603"],"URL":"https:\/\/doi.org\/10.1007\/s10489-024-05603-x","relation":{},"ISSN":["0924-669X","1573-7497"],"issn-type":[{"value":"0924-669X","type":"print"},{"value":"1573-7497","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,19]]},"assertion":[{"value":"9 June 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 June 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflicts of interest on this work.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interests"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical and informed consent for data used"}}]}}