{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T16:17:38Z","timestamp":1776442658156,"version":"3.51.2"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,8,14]],"date-time":"2023-08-14T00:00:00Z","timestamp":1691971200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,8,14]],"date-time":"2023-08-14T00:00:00Z","timestamp":1691971200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004731","name":"Natural Science Foundation of Zhejiang Province","doi-asserted-by":"publisher","award":["LQ23F020014"],"award-info":[{"award-number":["LQ23F020014"]}],"id":[{"id":"10.13039\/501100004731","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62271177"],"award-info":[{"award-number":["62271177"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Although the short-text retrieval model by BERT achieves significant performance improvement, research on the efficiency and performance of long-text retrieval still faces challenges. Therefore, this study proposes an efficient long-text retrieval model based on BERT (called LTR-BERT). This model achieves speed improvement while retaining most of the long-text retrieval performance. In particular, the LTR-BERT model is trained using the relevance between short texts. Then, the long text is segmented and stored off-line. 
In the retrieval stage, only the encoding of the query and the matching scores are calculated, which speeds up the retrieval. Moreover, a query expansion strategy is designed to enhance the representation of the original query and reserve the encoding region for the query. This is beneficial for learning missing information in the representation stage. The interaction mechanism, which has no trainable parameters, takes into account both local semantic details and overall relevance to ensure the accuracy of retrieval and further shorten the response time. Experiments are carried out on the MS MARCO Document Ranking dataset, which is specially designed for long-text retrieval. Compared with the interaction-focused semantic matching method by BERT-CLS, the MRR@10 value of the proposed LTR-BERT method is increased by 2.74%. Moreover, the number of documents processed per millisecond is increased by 333 times.<\/jats:p>","DOI":"10.1007\/s40747-023-01192-3","type":"journal-article","created":{"date-parts":[[2023,8,14]],"date-time":"2023-08-14T20:04:16Z","timestamp":1692043456000},"page":"963-979","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["An efficient long-text semantic retrieval approach via utilizing presentation learning on short-text"],"prefix":"10.1007","volume":"10","author":[{"given":"Junmei","family":"Wang","sequence":"first","affiliation":[]},{"given":"Jimmy X.","family":"Huang","sequence":"additional","affiliation":[]},{"given":"Jinhua","family":"Sheng","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,8,14]]},"reference":[{"key":"1192_CR1","doi-asserted-by":"crossref","unstructured":"Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proc. 16th conf. North Am. chapter assoc. comput. linguist., pp 2227\u20132237. 
http:\/\/arxiv.org\/abs\/1802.05365","DOI":"10.18653\/v1\/N18-1202"},{"key":"1192_CR2","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proc. 17th conf. North Am. chapter assoc. comput. linguist. hum. lang. technol., Minneapolis, USA, pp 4171\u20134186. http:\/\/arxiv.org\/abs\/1810.04805"},{"key":"1192_CR3","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1007\/s40747-022-00819-1","volume":"9","author":"C Liu","year":"2023","unstructured":"Liu C, Zhu W, Zhang X, Zhai Q (2023) Sentence part-enhanced BERT with respect to downstream tasks. Complex Intell Syst 9:463\u2013474. https:\/\/doi.org\/10.1007\/s40747-022-00819-1","journal-title":"Complex Intell Syst"},{"key":"1192_CR4","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1007\/s40747-020-00147-2","volume":"6","author":"Y Wang","year":"2020","unstructured":"Wang Y, Rong W, Zhang J, Zhou S, Xiong Z (2020) Multi-turn dialogue-oriented pretrained question generation model. Complex Intell Syst 6:493\u2013505. https:\/\/doi.org\/10.1007\/s40747-020-00147-2","journal-title":"Complex Intell Syst"},{"key":"1192_CR5","doi-asserted-by":"publisher","unstructured":"Dai Z, Callan J (2019) Deeper text understanding for IR with contextual neural language modeling. In: Proc. 42nd int. ACM SIGIR conf. res. dev. inf. Retrieval (SIGIR\u201919), pp 985\u2013988. https:\/\/doi.org\/10.1145\/3331184.3331303","DOI":"10.1145\/3331184.3331303"},{"key":"1192_CR6","doi-asserted-by":"publisher","unstructured":"MacAvaney S, Yates A, Cohan A, Goharian N (2019) CEDR: contextualized embeddings for document ranking. In: Proc. 42nd int. ACM SIGIR conf. res. dev. inf. Retrieval (SIGIR\u201919). ACM, New York, USA, pp 1101\u20131104. 
https:\/\/doi.org\/10.1145\/3331184.3331317","DOI":"10.1145\/3331184.3331317"},{"key":"1192_CR7","doi-asserted-by":"publisher","unstructured":"Boualili L, Moreno JG, Boughanem M (2020) MarkedBERT: integrating traditional IR cues in pre-trained language models for passage retrieval. In: Proc. 43rd int. ACM SIGIR conf. res. dev. inf. Retrieval (SIGIR\u201920), pp 1977\u20131980. https:\/\/doi.org\/10.1145\/3397271.3401194","DOI":"10.1145\/3397271.3401194"},{"key":"1192_CR8","doi-asserted-by":"publisher","unstructured":"Akkalyoncu Yilmaz Z, Yang W, Zhang H, Lin J (2019) Cross-domain modeling of sentence-level evidence for document retrieval. In: Proc. 2019 conf. empir. methods nat. lang. process. 9th int. jt. conf. nat. lang. process. Association for Computational Linguistics, Stroudsburg, PA, USA, pp 3488\u20133494. https:\/\/doi.org\/10.18653\/v1\/D19-1352.","DOI":"10.18653\/v1\/D19-1352"},{"key":"1192_CR9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1561\/1500000061","volume":"13","author":"B Mitra","year":"2018","unstructured":"Mitra B, Craswell N (2018) An introduction to neural information retrieval. Found Inf Retr 13:1\u2013126. https:\/\/doi.org\/10.1561\/1500000061","journal-title":"Found Inf Retr"},{"key":"1192_CR10","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2021.102734","volume":"59","author":"M Pan","year":"2022","unstructured":"Pan M, Wang J, Huang JX, Huang AJ, Chen Q, Chen J (2022) A probabilistic framework for integrating sentence-level semantics via BERT into pseudo-relevance feedback. Inf Process Manage 59:102734. https:\/\/doi.org\/10.1016\/j.ipm.2021.102734","journal-title":"Inf Process Manage"},{"key":"1192_CR11","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2020.102342","volume":"57","author":"J Wang","year":"2020","unstructured":"Wang J, Pan M, He T, Huang X, Wang X, Tu X (2020) A pseudo-relevance feedback framework combining relevance matching and semantic matching for information retrieval. 
Inf Process Manage 57:102342. https:\/\/doi.org\/10.1016\/j.ipm.2020.102342","journal-title":"Inf Process Manage"},{"key":"1192_CR12","doi-asserted-by":"publisher","unstructured":"Khattab O, Zaharia M (2020) ColBERT: efficient and effective passage search via contextualized late interaction over BERT. In: Proc. 43rd int. ACM SIGIR conf. res. dev. inf. retrieval (SIGIR\u201920), pp 39\u201348. https:\/\/doi.org\/10.1145\/3397271.3401075.","DOI":"10.1145\/3397271.3401075"},{"key":"1192_CR13","doi-asserted-by":"publisher","unstructured":"Nie P, Zhang Y, Geng X, Ramamurthy A, Song L, Jiang D (2020) DC-BERT: decoupling question and document for efficient contextual encoding. In: Proc. 43rd int. ACM SIGIR conf. res. dev. inf. retrieval (SIGIR\u201920), pp 1829\u20131832. https:\/\/doi.org\/10.1145\/3397271.3401271","DOI":"10.1145\/3397271.3401271"},{"key":"1192_CR14","unstructured":"Nogueira R, Cho K (2019) Passage re-ranking with BERT. arXiv: 1901.04085. http:\/\/arxiv.org\/abs\/1901.04085"},{"key":"1192_CR15","doi-asserted-by":"publisher","unstructured":"Hofst\u00e4tter S, Zamani H, Mitra B, Craswell N, Hanbury A (2020) Local self-attention over long text for efficient document retrieval. In: Proc. 43rd int. ACM SIGIR conf. res. dev. inf. retr. ACM, New York, USA, pp 2021\u20132024. https:\/\/doi.org\/10.1145\/3397271.3401224","DOI":"10.1145\/3397271.3401224"},{"key":"1192_CR16","doi-asserted-by":"publisher","first-page":"1733","DOI":"10.1007\/s11071-021-06208-6","volume":"103","author":"T Wei","year":"2021","unstructured":"Wei T, Li X, Stojanovic V (2021) Input-to-state stability of impulsive reaction\u2013diffusion neural networks with infinite distributed delays. Nonlinear Dyn 103:1733\u20131755. 
https:\/\/doi.org\/10.1007\/s11071-021-06208-6","journal-title":"Nonlinear Dyn"},{"key":"1192_CR17","doi-asserted-by":"publisher","DOI":"10.1016\/j.nahs.2021.101088","volume":"42","author":"Z Xu","year":"2021","unstructured":"Xu Z, Li X, Stojanovic V (2021) Exponential stability of nonlinear state-dependent delayed impulsive systems with applications. Nonlinear Anal Hybrid Syst 42:101088. https:\/\/doi.org\/10.1016\/j.nahs.2021.101088","journal-title":"Nonlinear Anal Hybrid Syst"},{"key":"1192_CR18","doi-asserted-by":"publisher","DOI":"10.1145\/3485447.3511957","author":"S Xiao","year":"2022","unstructured":"Xiao S, Liu Z, Han W, Zhang J, Shao Y, Lian D, Li C, Sun H, Deng D, Zhang L, Zhang Q, Xie X (2022) Progressively optimized bi-granular document representation for scalable embedding based retrieval. Assoc Comput Mach. https:\/\/doi.org\/10.1145\/3485447.3511957","journal-title":"Assoc Comput Mach"},{"key":"1192_CR19","doi-asserted-by":"publisher","unstructured":"Yilmaz ZA, Wang S, Yang W, Zhang H, Lin J (2020) Applying BERT to document retrieval with birch. In: Proc. conf. empir. methods nat. lang. process. 9th int. jt. conf. nat. lang. process., pp 19\u201324. https:\/\/doi.org\/10.18653\/v1\/d19-3004","DOI":"10.18653\/v1\/d19-3004"},{"key":"1192_CR20","first-page":"2042","volume":"3","author":"B Hu","year":"2015","unstructured":"Hu B, Lu Z, Li H, Chen Q (2015) Convolutional neural network architectures for matching natural language sentences. Adv Neural Inf Process Syst 3:2042\u20132050","journal-title":"Adv Neural Inf Process Syst"},{"key":"1192_CR21","doi-asserted-by":"crossref","unstructured":"Pang L, Lan Y, Guo J, Xu J, Wan S, Cheng X (2016) Text matching as image recognition. In: Proc. 30th AAAI conf. artif. intell., pp 2793\u20132799. 
http:\/\/arxiv.org\/abs\/1602.06359","DOI":"10.1609\/aaai.v30i1.10341"},{"key":"1192_CR22","doi-asserted-by":"publisher","unstructured":"Hui K, Yates A, Berberich K, de Melo G (2017) PACRR: a position-aware neural IR model for relevance matching. In: Proc. 2017 conf. empir. methods nat. lang. process. Association for Computational Linguistics, Stroudsburg, PA, USA, pp 1049\u20131058. https:\/\/doi.org\/10.18653\/v1\/D17-1110.","DOI":"10.18653\/v1\/D17-1110"},{"key":"1192_CR23","doi-asserted-by":"publisher","unstructured":"Hui K, Yates A, Berberich K, de Melo G (2018) Co-PACRR: a context-aware neural IR model for ad-hoc retrieval. In: Proc. 11th ACM int. conf. web search data mining (WSDM\u201918). ACM, New York, NY, USA, pp 279\u2013287. https:\/\/doi.org\/10.1145\/3159652.3159689","DOI":"10.1145\/3159652.3159689"},{"key":"1192_CR24","doi-asserted-by":"publisher","unstructured":"Xiong C, Dai Z, Callan J, Liu Z, Power R (2017) End-to-end neural ad-hoc ranking with kernel pooling. In: Proc. 40th int. ACM SIGIR conf. res. dev. inf. retr. Association for Computing Machinery, Inc, pp 55\u201364. https:\/\/doi.org\/10.1145\/3077136.3080809","DOI":"10.1145\/3077136.3080809"},{"key":"1192_CR25","doi-asserted-by":"publisher","unstructured":"Dai Z, Xiong C, Callan J, Liu Z (2018) Convolutional neural networks for soft-matching n-grams in ad-hoc search. In: Proc. 11th ACM int. conf. web search data mining (WSDM\u201918). ACM, New York, NY, USA, pp 126\u2013134. https:\/\/doi.org\/10.1145\/3159652.3159659","DOI":"10.1145\/3159652.3159659"},{"key":"1192_CR26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3239571","volume":"10","author":"P Yang","year":"2018","unstructured":"Yang P, Fang H, Lin J (2018) Anserini: reproducible ranking baselines using lucene. J Data Inf Qual 10:1\u201320. 
https:\/\/doi.org\/10.1145\/3239571","journal-title":"J Data Inf Qual"},{"key":"1192_CR27","doi-asserted-by":"publisher","unstructured":"Huang P-S, He X, Gao J, Deng L, Acero A, Heck L (2013) Learning deep structured semantic models for web search using click through data. In: Proc. 22nd ACM int. conf. inf. knowl. manag., pp 2333\u20132338. https:\/\/doi.org\/10.1145\/2505515.2505665","DOI":"10.1145\/2505515.2505665"},{"key":"1192_CR28","doi-asserted-by":"publisher","unstructured":"Shen Y, He X, Gao J, Deng L, Mesnil G (2014) Learning semantic representations using convolutional neural networks for web search. In: Proc. 23rd int. conf. world wide web, pp 373\u2013374. https:\/\/doi.org\/10.1145\/2567948.2577348.","DOI":"10.1145\/2567948.2577348"},{"key":"1192_CR29","doi-asserted-by":"publisher","unstructured":"Shen Y, He X, Gao J, Deng L, Mesnil G (2014) A latent semantic model with convolutional-pooling structure for information retrieval. In: Proc. 23rd ACM int. conf. inf. knowl. manag., pp 101\u2013110. https:\/\/doi.org\/10.1145\/2661829.2661935","DOI":"10.1145\/2661829.2661935"},{"key":"1192_CR30","doi-asserted-by":"publisher","unstructured":"Guo J, Fan Y, Ai Q, Croft WB (2016) A deep relevance matching model for ad-hoc retrieval. In: Proc. 25th ACM int. conf. inf. knowl. manag. ACM, New York, USA, pp 55\u201364. https:\/\/doi.org\/10.1145\/2983323.2983769.","DOI":"10.1145\/2983323.2983769"},{"key":"1192_CR31","doi-asserted-by":"publisher","unstructured":"Zamani H, Dehghani M, Croft WB, Learned-Miller E, Kamps J (2018) From neural re-ranking to neural ranking: learning a sparse representation for inverted indexing. In: Proc. 27th ACM int. conf. inf. knowl. manag. ACM, New York, USA, pp 497\u2013506. https:\/\/doi.org\/10.1145\/3269206.3271800","DOI":"10.1145\/3269206.3271800"},{"key":"1192_CR32","unstructured":"Beltagy I, Peters ME, Cohan A (2020) Longformer: the long-document transformer, ArXiv: 2004.05150v1. 
http:\/\/arxiv.org\/abs\/2004.05150"},{"key":"1192_CR33","unstructured":"Ding M, Zhou C, Yang H, Tang J (2020) CogLTX: applying BERT to long texts. In: Proc. 34th int. conf. neural inf. process. syst., pp 12792\u201312804. https:\/\/github.com\/Sleepychord\/CogLTX"},{"key":"1192_CR34","doi-asserted-by":"publisher","first-page":"3461","DOI":"10.1109\/TSMC.2022.3225381","volume":"53","author":"Z Zhuang","year":"2022","unstructured":"Zhuang Z, Tao H, Chen Y, Stojanovic V, Paszke W (2022) An optimal iterative learning control approach for linear systems with nonuniform trial lengths under input constraints. IEEE Trans Syst Man Cybern Syst 53:3461\u20133473. https:\/\/doi.org\/10.1109\/TSMC.2022.3225381","journal-title":"IEEE Trans Syst Man Cybern Syst"},{"key":"1192_CR35","doi-asserted-by":"publisher","first-page":"10139","DOI":"10.1002\/rnc.6354","volume":"32","author":"C Zhou","year":"2022","unstructured":"Zhou C, Tao H, Chen Y, Stojanovic V, Paszke W (2022) Robust point-to-point iterative learning control for constrained systems: a minimum energy approach. Int J Robust Nonlinear Control 32:10139\u201310161. https:\/\/doi.org\/10.1002\/rnc.6354","journal-title":"Int J Robust Nonlinear Control"},{"key":"1192_CR36","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1186\/s12911-019-0986-6","volume":"19","author":"M Pan","year":"2019","unstructured":"Pan M, Zhang Y, Zhu Q, Sun B, He T, Jiang X (2019) An adaptive term proximity based Rocchio\u2019s model for clinical decision support retrieval. BMC Med Inform Decision Mak 19:251. https:\/\/doi.org\/10.1186\/s12911-019-0986-6","journal-title":"BMC Med Inform Decision Mak"},{"key":"1192_CR37","doi-asserted-by":"publisher","unstructured":"MacAvaney S, Nardini FM, Perego R, Tonellotto N, Goharian N, Frieder O (2020) Efficient document re-ranking for transformers by precomputing term representations. In: Proc. 43rd int. ACM SIGIR conf. res. dev. inf. retr., pp 49\u201358. 
https:\/\/doi.org\/10.1145\/3397271.3401093","DOI":"10.1145\/3397271.3401093"},{"key":"1192_CR38","unstructured":"Bajaj P, Campos D, Craswell N, Deng L, Gao J, Liu X, Majumder R, McNamara A, Mitra B, Nguyen T, Rosenberg M, Song X, Stoica A, Tiwary S, Wang T (2016) MS MARCO: a human generated machine reading comprehension dataset. In: Proc. 30th conf. neural inf. process. syst., pp 1\u201311. http:\/\/arxiv.org\/abs\/1611.09268"},{"key":"1192_CR39","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-022-2041-5","volume":"17","author":"J Wang","year":"2023","unstructured":"Wang J, Zhao W, Tu X, He T (2023) A novel dense retrieval framework for long document retrieval. Front Comput Sci 17:174609. https:\/\/doi.org\/10.1007\/s11704-022-2041-5","journal-title":"Front Comput Sci"},{"key":"1192_CR40","doi-asserted-by":"publisher","first-page":"1201","DOI":"10.1109\/TKDE.2012.24","volume":"25","author":"X Yin","year":"2013","unstructured":"Yin X, Huang JX, Li Z, Zhou X (2013) A survival modeling approach to biomedical search result diversification using wikipedia. IEEE Trans Knowl Data Eng 25:1201\u20131212. https:\/\/doi.org\/10.1109\/TKDE.2012.24","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"1192_CR41","unstructured":"Huang X, Zhong M, Si L (2005) York University at {TREC} 2005: Genomics track. In: Voorhees EM, Buckland LP (eds) Proceedings of the Fourteenth Text REtrieval Conference, National Institute of Standards and Technology (NIST), Gaithersburg, Maryland. http:\/\/trec.nist.gov\/pubs\/trec14\/papers\/yorkuhuang2.geo.pdf"},{"key":"1192_CR42","doi-asserted-by":"publisher","unstructured":"Huang X, Hu Q (2009) A bayesian learning approach to promoting diversity in ranking for biomedical information retrieval. In: Proceedings of the 32nd annual international acm sigir conference on research and development in information retrieval, SIGIR 2009, Boston, MA, USA. ACM Press, New York, USA, pp 307\u2013314. 
https:\/\/doi.org\/10.1145\/1571941.1571995","DOI":"10.1145\/1571941.1571995"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01192-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-023-01192-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01192-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,10]],"date-time":"2024-02-10T22:28:21Z","timestamp":1707604101000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-023-01192-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,14]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,2]]}},"alternative-id":["1192"],"URL":"https:\/\/doi.org\/10.1007\/s40747-023-01192-3","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,14]]},"assertion":[{"value":"20 March 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 July 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 August 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could 
have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}