{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,20]],"date-time":"2025-09-20T03:16:21Z","timestamp":1758338181037,"version":"3.44.0"},"reference-count":46,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T00:00:00Z","timestamp":1756339200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62272358"],"award-info":[{"award-number":["62272358"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Pangenome indexing is a critical supporting technology in biological sequence analysis such as read alignment applications. The need to accurately identify billions of small sequencing fragments carrying sequencing errors and genomic variants drives the development of scalable and efficient pangenome indexing approach.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose a new wavelet tree-based approach, called Panaln, for indexing pangenome and introduce a batch computation approach for fast count query over Panaln. We present a simple and effective seeding strategy and develop a pangenome program that uses the seed-and-extend paradigm for read alignment. Experimental results on simulated and real data demonstrate that Panaln uses significantly less space for the compared pangenome methods with generally higher accuracy. 
We provide a scalable index construction by representing pangenome with a linear model. Additionally, Panaln brings enhanced accuracy compared to the popular single reference methods.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Package: https:\/\/anaconda.org\/bioconda\/panaln and source code: https:\/\/github.com\/Lilu-guo\/Panaln.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf476","type":"journal-article","created":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T16:16:46Z","timestamp":1756397806000},"source":"Crossref","is-referenced-by-count":0,"title":["Panaln: indexing pangenome for read alignment"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-4647-7202","authenticated-orcid":false,"given":"Lilu","family":"Guo","sequence":"first","affiliation":[{"name":"Department of Computer Science, Xidian University , Xi\u2019an 710071,","place":["China"]}]},{"given":"Zongtao","family":"He","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Xidian University , Xi\u2019an 710071,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5436-1851","authenticated-orcid":false,"given":"Hongwei","family":"Huo","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Xidian University , Xi\u2019an 710071,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2025,8,28]]},"reference":[{"key":"2025091918573986300_btaf476-B1","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1186\/s13059-021-02443-7","article-title":"Technology dictates algorithms: recent developments in read alignment","volume":"22","author":"Alser","year":"2021","journal-title":"Genome 
Biol"},{"key":"2025091918573986300_btaf476-B2","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1007\/s11047-022-09882-6","article-title":"Computational graph pangenomics: a tutorial on data structures and their applications","volume":"21","author":"Baaijens","year":"2022","journal-title":"Nat Comput"},{"article-title":"A block-sorting lossless data compression algorithm","year":"1994","author":"Burrows","key":"2025091918573986300_btaf476-B3"},{"key":"2025091918573986300_btaf476-B4","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1109\/JPROC.2015.2455551","article-title":"Short read mapping: an algorithmic tour","volume":"105","author":"Canzar","year":"2017","journal-title":"Proc IEEE Inst Electr Electron Eng"},{"key":"2025091918573986300_btaf476-B5","doi-asserted-by":"crossref","first-page":"1265","DOI":"10.1101\/gr.279143.124","article-title":"Haplotype-aware sequence alignment to pangenome graphs","volume":"34","author":"Chandra","year":"2024","journal-title":"Genome Res"},{"key":"2025091918573986300_btaf476-B6","doi-asserted-by":"crossref","first-page":"3021","DOI":"10.1093\/nar\/13.9.3021","article-title":"Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984","volume":"13","author":"Cornish-Bowden","year":"1985","journal-title":"Nucleic Acids Res"},{"key":"2025091918573986300_btaf476-B7","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1146\/annurev-genom-120219-080406","article-title":"Pangenome graphs","volume":"21","author":"Eizenga","year":"2020","journal-title":"Annu Rev Genomics Hum Genet"},{"first-page":"390","year":"2000","author":"Ferragina","key":"2025091918573986300_btaf476-B8"},{"author":"Foschini","key":"2025091918573986300_btaf476-B9","first-page":"62"},{"key":"2025091918573986300_btaf476-B10","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1038\/nbt.4227","article-title":"Variation graph toolkit improves read mapping by representing genetic variation in the 
reference","volume":"36","author":"Garrison","year":"2018","journal-title":"Nat Biotechnol"},{"author":"Grossi","key":"2025091918573986300_btaf476-B11","first-page":"841"},{"author":"Grossi","key":"2025091918573986300_btaf476-B12","first-page":"210"},{"key":"2025091918573986300_btaf476-B13","doi-asserted-by":"crossref","first-page":"108050","DOI":"10.1016\/j.compbiolchem.2024.108050","article-title":"An efficient burrows\u2013wheeler transform-based aligner for short read mapping","volume":"110","author":"Guo","year":"2024","journal-title":"Comput Biol Chem"},{"key":"2025091918573986300_btaf476-B14","doi-asserted-by":"crossref","first-page":"i361","DOI":"10.1093\/bioinformatics\/btt215","article-title":"Short read alignment with populations of genomes","volume":"29","author":"Huang","year":"2013","journal-title":"Bioinformatics"},{"author":"Huo","key":"2025091918573986300_btaf476-B15","first-page":"10"},{"key":"2025091918573986300_btaf476-B16","doi-asserted-by":"crossref","first-page":"2394","DOI":"10.1109\/TCBB.2020.2968323","article-title":"Efficient compression and indexing for highly repetitive DNA sequence collections","volume":"18","author":"Huo","year":"2021","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2025091918573986300_btaf476-B17","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1093\/bioinformatics\/btab655","article-title":"CIndex: compressed indexes for fast retrieval of FASTQ files","volume":"38","author":"Huo","year":"2022","journal-title":"Bioinformatics"},{"key":"2025091918573986300_btaf476-B18","doi-asserted-by":"crossref","first-page":"2943","DOI":"10.1109\/TKDE.2021.3114401","article-title":"Practical high-order entropy-compressed text self-indexing","volume":"35","author":"Huo","year":"2023","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2025091918573986300_btaf476-B19","first-page":"1","volume-title":"The Expanding World of Compressed 
Data","author":"Huo","year":"2025"},{"author":"Huo","key":"2025091918573986300_btaf476-B20","first-page":"2478"},{"key":"2025091918573986300_btaf476-B21","first-page":"290","volume-title":"Workshop on Algorithms in Bioinformatics (WABI), Aarhus, Denmark","author":"Iqbal","year":"2016"},{"key":"2025091918573986300_btaf476-B22","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1038\/s41587-019-0201-4","article-title":"Graph-based genome alignment and genotyping with hisat2 and hisat-genotype","volume":"37","author":"Kim","year":"2019","journal-title":"Nat Biotechnol"},{"key":"2025091918573986300_btaf476-B23","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat Methods"},{"author":"Li","key":"2025091918573986300_btaf476-B24","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1303.3997"},{"key":"2025091918573986300_btaf476-B25","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with burrows\u2013wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2025091918573986300_btaf476-B26","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1186\/s13059-020-02168-z","article-title":"The design and construction of reference pangenome graphs with minigraph","volume":"21","author":"Li","year":"2020","journal-title":"Genome Biol"},{"key":"2025091918573986300_btaf476-B27","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1038\/s41586-023-05896-x","article-title":"A draft human pangenome reference","volume":"617","author":"Liao","year":"2023","journal-title":"Nature"},{"key":"2025091918573986300_btaf476-B28","doi-asserted-by":"crossref","first-page":"935","DOI":"10.1137\/0222058","article-title":"Suffix arrays: a new method for on-line string 
searches","volume":"22","author":"Manber","year":"1993","journal-title":"SIAM J Comput"},{"key":"2025091918573986300_btaf476-B29","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1145\/382780.382782","article-title":"An analysis of the burrows\u2013wheeler transform","volume":"48","author":"Manzini","year":"2001","journal-title":"J ACM"},{"key":"2025091918573986300_btaf476-B30","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1093\/bioinformatics\/btaa777","article-title":"Fast gap-affine pairwise alignment using the wavefront algorithm","volume":"37","author":"Marco-Sola","year":"2021","journal-title":"Bioinformatics"},{"key":"2025091918573986300_btaf476-B31","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btad074","article-title":"Optimal gap-affine alignment in o (s) space","volume":"39","author":"Marco-Sola","year":"2023","journal-title":"Bioinformatics"},{"key":"2025091918573986300_btaf476-B32","first-page":"118","article-title":"Computational pan-genomics: status, promises and challenges","volume":"19","author":"Marschall","year":"2018","journal-title":"Brief Bioinform"},{"key":"2025091918573986300_btaf476-B33","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1186\/s13059-020-02157-2","article-title":"Graphaligner: rapid and versatile sequence-to-graph alignment","volume":"21","author":"Rautiainen","year":"2020","journal-title":"Genome Biol"},{"key":"2025091918573986300_btaf476-B34","doi-asserted-by":"crossref","first-page":"3599","DOI":"10.1093\/bioinformatics\/btz162","article-title":"Bit-parallel sequence-to-graph alignment","volume":"35","author":"Rautiainen","year":"2019","journal-title":"Bioinformatics"},{"key":"2025091918573986300_btaf476-B35","doi-asserted-by":"crossref","first-page":"3363","DOI":"10.1093\/bioinformatics\/bth408","article-title":"Reducing storage requirements for biological sequence 
comparison","volume":"20","author":"Roberts","year":"2004","journal-title":"Bioinformatics"},{"key":"2025091918573986300_btaf476-B36","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1186\/s13059-022-02831-7","article-title":"Strobealign: flexible seed size enables ultra-fast and accurate read alignment","volume":"23","author":"Sahlin","year":"2022","journal-title":"Genome Biol"},{"key":"2025091918573986300_btaf476-B37","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1186\/s13059-023-02972-3","article-title":"A survey of mapping algorithms in the long-reads era","volume":"24","author":"Sahlin","year":"2023","journal-title":"Genome Biol"},{"key":"2025091918573986300_btaf476-B38","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1038\/s41576-020-0210-7","article-title":"Pan-genomics in the human genome era","volume":"21","author":"Sherman","year":"2020","journal-title":"Nat Rev Genet"},{"first-page":"13","year":"2017","author":"Sir\u00e9n","key":"2025091918573986300_btaf476-B39"},{"key":"2025091918573986300_btaf476-B40","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1109\/TCBB.2013.2297101","article-title":"Indexing graphs for path queries with applications in genome research","volume":"11","author":"Sir\u00e9n","year":"2014","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2025091918573986300_btaf476-B41","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1093\/bioinformatics\/btz575","article-title":"Haplotype-aware graph indexes","volume":"36","author":"Sir\u00e9n","year":"2020","journal-title":"Bioinformatics"},{"key":"2025091918573986300_btaf476-B42","doi-asserted-by":"crossref","first-page":"abg8871","DOI":"10.1126\/science.abg8871","article-title":"Pangenomics enables genotyping of known structural variants in 5202 diverse 
genomes","volume":"374","author":"Sir\u00e9n","year":"2021","journal-title":"Science"},{"key":"2025091918573986300_btaf476-B43","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1186\/s12864-018-4465-8","article-title":"Towards pan-genome read alignment to improve variation calling","volume":"19","author":"Valenzuela","year":"2018","journal-title":"BMC Genomics"},{"author":"Vasimuddin","key":"2025091918573986300_btaf476-B44","first-page":"314"},{"key":"2025091918573986300_btaf476-B45","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1186\/s12859-021-04162-z","article-title":"Accel-align: a fast sequence mapper and aligner based on the seed\u2013embed\u2013extend method","volume":"22","author":"Yan","year":"2021","journal-title":"BMC Bioinformatics"},{"author":"Yan","key":"2025091918573986300_btaf476-B46","first-page":"144"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf476\/64149626\/btaf476.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/9\/btaf476\/64149626\/btaf476.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/9\/btaf476\/64149626\/btaf476.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T22:57:51Z","timestamp":1758322671000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf476\/8242760"}},"subtitle":[],"editor":[{"given":"Pier 
Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,8,28]]},"references-count":46,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2025,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf476","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,9]]},"published":{"date-parts":[[2025,8,28]]},"article-number":"btaf476"}}