{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T03:43:49Z","timestamp":1780544629519,"version":"3.54.1"},"reference-count":25,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T00:00:00Z","timestamp":1760313600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"<jats:p>While the transformer architecture has demonstrated strong success in natural language processing and computer vision, its application to limit order book forecasting, particularly in capturing spatial and temporal dependencies, remains limited. In this work, we introduce the Limit Order Book Transformer (LiT), a novel deep learning architecture for forecasting short-term market movements using high-frequency limit order book data. Unlike previous approaches that rely on convolutional layers, LiT leverages structured patches and transformer-based self-attention to model spatial and temporal features in market microstructure dynamics. We evaluate LiT on multiple limit order book (LOB) datasets across different prediction horizons, where it consistently outperforms traditional machine learning methods and state-of-the-art deep learning baselines. 
Furthermore, we show that LiT maintains robust performance under distributional shifts via fine-tuning, making it a practical solution for fast-paced and dynamic financial environments.<\/jats:p>","DOI":"10.3389\/frai.2025.1616485","type":"journal-article","created":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T10:35:53Z","timestamp":1760351753000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["LiT: limit order book transformer"],"prefix":"10.3389","volume":"8","author":[{"given":"Yue","family":"Xiao","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Carmine","family":"Ventre","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuhan","family":"Wang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Haochen","family":"Li","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuxi","family":"Huan","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Buhong","family":"Liu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1965","published-online":{"date-parts":[[2025,10,13]]},"reference":[{"key":"B1","author":"Abadi","year":"2015","journal-title":"TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems"},{"key":"B2","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1109\/UKSim.2014.67","article-title":"Stock price prediction using the ARIMA model","author":"Ariyo","year":"2014"},{"key":"B3","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1080\/14697688.2023.2286351","article-title":"Deep attentive survival analysis in limit order books: estimating fill probabilities with 
convolutional-transformers","volume":"24","author":"Arroyo","year":"2024","journal-title":"Quant. Finance"},{"key":"B4","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1409.0473","article-title":"Neural machine translation by jointly learning to align and translate","author":"Bahdanau","year":"2014","journal-title":"arXiv preprint arXiv:1409.0473"},{"key":"B5","unstructured":"Chollet F. Keras 2015"},{"key":"B6","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2010.11929","article-title":"An image is worth 16 \u00d7 16 words: transformers for image recognition at scale","author":"Dosovitskiy","year":"2020","journal-title":"arXiv preprint arXiv:2010.11929"},{"key":"B7","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1080\/1351847X.2021.1908390","article-title":"Ascertaining price formation in cryptocurrency markets with machine learning","volume":"30","author":"Fang","year":"2021","journal-title":"Eur. J. Finance"},{"key":"B8","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput"},{"key":"B9","doi-asserted-by":"publisher","first-page":"1315","DOI":"10.1080\/14697688.2015.1032546","article-title":"Modelling high-frequency limit order book dynamics with support vector machines","volume":"15","author":"Kercheval","year":"2015","journal-title":"Quant. Finance"},{"key":"B10","doi-asserted-by":"publisher","year":"2025","journal-title":"King's Computational Research, Engineering and Technology Environment (CREATE)","DOI":"10.18742\/rnvf-m076"},{"key":"B11","doi-asserted-by":"publisher","first-page":"689","DOI":"10.1016\/j.ejor.2016.10.031","article-title":"Deep neural networks, gradient-boosted trees, random forests: statistical arbitrage on the S&P 500","volume":"259","author":"Krauss","year":"2017","journal-title":"Eur. J. Oper. 
Res"},{"key":"B12","doi-asserted-by":"publisher","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. IEEE"},{"key":"B13","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1007\/s00521-014-1550-z","article-title":"Empirical analysis: stock market prediction via extreme learning machine","volume":"27","author":"Li","year":"2016","journal-title":"Neural Comput. Appl"},{"key":"B14","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1508.04025","article-title":"Effective approaches to attention-based neural machine translation","author":"Luong","year":"2015","journal-title":"arXiv preprint arXiv:1508.04025"},{"key":"B15","doi-asserted-by":"publisher","first-page":"852","DOI":"10.1002\/for.2543","article-title":"Benchmark dataset for mid-price forecasting of limit order book data with machine learning methods","volume":"37","author":"Ntakaris","year":"2018","journal-title":"J. Forecast"},{"key":"B16","doi-asserted-by":"publisher","first-page":"497","DOI":"10.1016\/j.omega.2004.07.024","article-title":"A hybrid ARIMA and support vector machines model in stock price forecasting","volume":"33","author":"Pai","year":"2005","journal-title":"Omega"},{"key":"B17","doi-asserted-by":"publisher","first-page":"1449","DOI":"10.1080\/14697688.2019.1622295","article-title":"Universal features of price formation in financial markets: perspectives from deep learning","volume":"19","author":"Sirignano","year":"2019","journal-title":"Quant. 
Finance"},{"key":"B18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/HSI52170.2021.9538640","article-title":"Multi-head self-attention transformer for Dogecoin price prediction","author":"Sridhar","year":"2021"},{"key":"B19","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1109\/CBI.2017.23","article-title":"Forecasting stock prices from the limit order book using convolutional neural networks","author":"Tsantekidis","year":"2017"},{"key":"B20","doi-asserted-by":"publisher","first-page":"106401","DOI":"10.1016\/j.asoc.2020.106401","article-title":"Using deep learning for price prediction by exploiting stationary limit order book features","volume":"93","author":"Tsantekidis","year":"2020","journal-title":"Appl. Soft Comput"},{"key":"B21","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349","article-title":"Attention is all you need","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst"},{"key":"B22","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2003.00130","article-title":"Transformers for limit order books","author":"Wallbridge","year":"2020","journal-title":"arXiv preprint arXiv:2003.00130"},{"key":"B23","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1080\/1350486X.2021.1967767","article-title":"Deep learning for market by order data","volume":"28","author":"Zhang","year":"2021","journal-title":"Appl. Math. Finance"},{"key":"B24","doi-asserted-by":"publisher","first-page":"3001","DOI":"10.1109\/TSP.2019.2907260","article-title":"DeepLOB: deep convolutional neural networks for limit order books","volume":"67","author":"Zhang","year":"2019","journal-title":"IEEE Trans. 
Signal Process"},{"key":"B25","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1204.1381","article-title":"Price jump prediction in limit order book","author":"Zheng","year":"2012","journal-title":"arXiv preprint arXiv:1204.1381"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2025.1616485\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T10:35:54Z","timestamp":1760351754000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2025.1616485\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,13]]},"references-count":25,"alternative-id":["10.3389\/frai.2025.1616485"],"URL":"https:\/\/doi.org\/10.3389\/frai.2025.1616485","relation":{},"ISSN":["2624-8212"],"issn-type":[{"value":"2624-8212","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,13]]},"article-number":"1616485"}}