{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T17:28:55Z","timestamp":1773768535564,"version":"3.50.1"},"reference-count":33,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T00:00:00Z","timestamp":1773705600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"<jats:p>We introduce a biologically inspired bird-flocking experimental framework for text summarization that identifies the most salient sentences using contextual information, sentence position, and thematic relevance. The bird-flocking-inspired algorithm, combined with large language models (LLMs), generates summaries with greater factual accuracy. The algorithm ensures source faithfulness by preventing the generation of new, unsupported information, thereby mitigating the risk of model hallucination by grounding the summary exclusively in the original text. While large language models (LLMs) achieve remarkable fluency in abstractive summarization, they frequently hallucinate generating plausible but unsupported content. We introduce a bio-inspired bird-flocking framework that addresses this limitation by serving as a preprocessing step for LLM-based summarization. Our method identifies the most salient, source-faithful sentences using contextual information, sentence position, and thematic relevance, providing LLMs with factually grounded input that constrains generation to verified content. Experimental results show that our methodology consistently produces concise and factually correct summaries, as experimented with the commonly used quality measurement scores. 
The framework provides a mechanism for text summarization that incorporates unified stop-word control, collocation recognition with synonym expansion, attention combination with fallback, score normalization between global and local saliency, and an unsupervised, bio-inspired Flock-by-Leader text clustering algorithm. These components contribute not only to improved consistency and diversity of the summary, but also to reduced hallucinations in text summarization. The algorithms and experimental framework proposed in this study serve as an efficient preprocessing step that complements both conventional and generative AI-based text summarization methods. The framework produces a well-structured intermediate representation of the source document, which is then provided to the LLM to generate the final summary. Across over 9,000 long-form documents in healthcare and energy, our framework consistently outperforms a major large language model baseline, with gains of 7.28% in ROUGE-1, 6.19% in ROUGE-L, and 45.28% in entity coverage.<\/jats:p>","DOI":"10.3389\/frai.2026.1703769","type":"journal-article","created":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T14:29:30Z","timestamp":1773757770000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["A bird-inspired artificial intelligence framework for advanced large text summarization"],"prefix":"10.3389","volume":"9","author":[{"given":"Binxu","family":"Huang","sequence":"first","affiliation":[{"name":"Department of Computer Science, Courant Institute of Mathematical Sciences, New York University","place":["New York, NY, United States"]}]},{"given":"Anasse","family":"Bari","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Courant Institute of Mathematical Sciences, New York University","place":["New York, NY, United 
States"]}]}],"member":"1965","published-online":{"date-parts":[[2026,3,17]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1145\/360825.360855","article-title":"Efficient string matching: an aid to bibliographic search","volume":"18","author":"Aho","year":"1975","journal-title":"Commun. ACM"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.14569\/IJACSA.2017.081052","article-title":"Text summarization techniques: a brief survey","author":"Allahyari","year":"2017","journal-title":"arXiv"},{"key":"B3","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1007\/978-3-642-31020-1_15","article-title":"\u201cFlock by leader: a novel machine learning biologically inspired clustering algorithm,\u201d","author":"Bellaachia","year":"2012","journal-title":"Advances in Swarm Intelligence"},{"key":"B4","doi-asserted-by":"publisher","first-page":"993","DOI":"10.7551\/mitpress\/1120.003.0082","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Machine Learn. Res"},{"key":"B5","doi-asserted-by":"publisher","first-page":"1877","DOI":"10.52202\/079017-0617","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst"},{"key":"B6","doi-asserted-by":"publisher","first-page":"505","DOI":"10.1016\/j.sysarc.2006.02.003","article-title":"A flocking based algorithm for document clustering analysis","volume":"52","author":"Cui","year":"2006","journal-title":"J. Syst. Architect"},{"key":"B7","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2401.02954","article-title":"DeepSeek LLM: scaling open-source language models with longtermism","year":"2024","journal-title":"arXiv"},{"key":"B8","doi-asserted-by":"publisher","first-page":"264","DOI":"10.1145\/321510.321519","article-title":"New methods in automatic extracting","volume":"16","author":"Edmundson","year":"1969","journal-title":"J. 
ACM"},{"key":"B9","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/7287.001.0001","volume-title":"WordNet: An Electronic Lexical Database","author":"Fellbaum","year":"1998"},{"key":"B10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10462-016-9475-9","article-title":"Recent automatic text summarization techniques: a survey","volume":"47","author":"Gambhir","year":"2017","journal-title":"Artif. Intell. Rev"},{"key":"B11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3571730","article-title":"Survey of hallucination in natural language generation","volume":"55","author":"Ji","year":"2023","journal-title":"ACM Comput. Surv"},{"key":"B12","unstructured":"LENR Dashboard\n          \n          2026"},{"key":"B13","doi-asserted-by":"crossref","first-page":"7871","DOI":"10.18653\/v1\/2020.acl-main.703","article-title":"\u201cBART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension,\u201d","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Lewis","year":"2020"},{"key":"B14","first-page":"74","article-title":"\u201cROUGE: A package for automatic evaluation of summaries,\u201d","volume-title":"Text Summarization Branches Out","author":"Lin","year":"2004"},{"key":"B15","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1162\/tacl_a_00638","article-title":"Lost in the middle: how language models use long contexts","volume":"12","author":"Liu","year":"2024","journal-title":"Trans. Assoc. Comput. 
Linguist"},{"key":"B16","first-page":"3730","article-title":"\u201cText summarization with pretrained encoders,\u201d","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Liu","year":"2019"},{"key":"B17","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1007\/978-3-642-04346-8_62","article-title":"\u201cGROBID: Combining automatic bibliographic data recognition and term extraction for scholarship publications,\u201d","volume-title":"Research and Advanced Technology for Digital Libraries (ECDL 2009)","author":"Lopez","year":"2009"},{"key":"B18","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1147\/rd.22.0159","article-title":"The automatic creation of literature abstracts","volume":"2","author":"Luhn","year":"1958","journal-title":"IBM J. Res. Dev"},{"key":"B19","doi-asserted-by":"crossref","first-page":"55","DOI":"10.3115\/v1\/P14-5010","article-title":"\u201cThe Stanford CoreNLP natural language processing Toolkit,\u201d","volume-title":"Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations","author":"Manning","year":"2014"},{"key":"B20","first-page":"404","article-title":"\u201cTextRank: Bringing order into texts,\u201d","volume-title":"Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing","author":"Mihalcea","year":"2004"},{"key":"B21","doi-asserted-by":"crossref","first-page":"280","DOI":"10.18653\/v1\/K16-1028","article-title":"\u201cAbstractive text summarization using sequence-to-sequence RNNs and beyond,\u201d","volume-title":"Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning","author":"Nallapati","year":"2016"},{"key":"B22","unstructured":"Pubmed Central (PMC)\n          \n          
2025"},{"key":"B23","doi-asserted-by":"crossref","DOI":"10.1201\/9780429445927","volume-title":"Advances in Swarm Intelligence for Optimizing Problems in Computer Science","author":"Nayyar","year":"2018"},{"key":"B24","unstructured":"Why Language Models Hallucinate (and how to get them to tell the truth)\n          \n          2025"},{"key":"B25","unstructured":"The Oncologist\n          \n          2025"},{"key":"B26","first-page":"8748","article-title":"\u201cLearning transferable visual models from natural language supervision,\u201d","volume-title":"Proceedings of the 38th International Conference on Machine Learning","author":"Radford","year":"2021"},{"key":"B27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.4018\/978-1-5225-2375-8.ch003","article-title":"\u201cBio-inspired algorithms for text summarization: a review,\u201d","author":"Rautray","year":"2017","journal-title":"Bio-Inspired Computing for Information Retrieval Applications"},{"key":"B28","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/D19-1410","article-title":"\u201cSentence-BERT: Sentence embeddings using siamese BERT-networks,\u201d","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Reimers","year":"2019"},{"key":"B29","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1145\/37401.37406","article-title":"\u201cFlocks, herds and schools: a distributed behavioral model,\u201d","volume-title":"Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '87)","author":"Reynolds","year":"1987"},{"key":"B30","doi-asserted-by":"crossref","first-page":"379","DOI":"10.18653\/v1\/D15-1044","article-title":"\u201cA neural attention model for abstractive sentence summarization,\u201d","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural Language 
Processing","author":"Rush","year":"2015"},{"key":"B31","first-page":"1073","article-title":"\u201cGet to the point: Summarization with pointer-generator networks,\u201d","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics","author":"See","year":"2017"},{"key":"B32","volume-title":"Future Shock","author":"Toffler","year":"1970"},{"key":"B33","first-page":"11328","article-title":"\u201cPEGASUS: Pre-training with extracted gap-sentences for abstractive summarization,\u201d","volume-title":"Proceedings of the 37th International Conference on Machine Learning","author":"Zhang","year":"2020"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2026.1703769\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T14:29:35Z","timestamp":1773757775000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2026.1703769\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,17]]},"references-count":33,"alternative-id":["10.3389\/frai.2026.1703769"],"URL":"https:\/\/doi.org\/10.3389\/frai.2026.1703769","relation":{},"ISSN":["2624-8212"],"issn-type":[{"value":"2624-8212","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,17]]},"article-number":"1703769"}}