{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T15:46:35Z","timestamp":1753890395555,"version":"3.41.2"},"reference-count":41,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2024,7,1]],"date-time":"2024-07-01T00:00:00Z","timestamp":1719792000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003194","name":"Agent\u00fara Ministerstva \u0160kolstva, Vedy, V\u00fdskumu a \u0160portu SR","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100003194","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005357","name":"Agent\u00fara na Podporu V\u00fdskumu a V\u00fdvoja","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100005357","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000051","name":"National Human Genome Research Institute","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000051","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018694","name":"HORIZON EUROPE Marie Sklodowska-Curie Actions","doi-asserted-by":"publisher","award":["872539"],"award-info":[{"award-number":["872539"]}],"id":[{"id":"10.13039\/100018694","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>We generalize a problem of finding maximum-scoring segment sets, previously studied by Cs\u0171r\u00f6s (IEEE\/ACM Transactions on Computational Biology and Bioinformatics, 2004, 1, 139\u2013150), from sequences to graphs. Namely, given a vertex-weighted graph <jats:italic>G<\/jats:italic> and a non-negative startup penalty <jats:italic>c<\/jats:italic>, we can find a set of vertex-disjoint paths in <jats:italic>G<\/jats:italic> with maximum total score when each path\u2019s score is its vertices\u2019 total weight minus <jats:italic>c<\/jats:italic>. We call this new problem <jats:italic>maximum-scoring path sets<\/jats:italic> (MSPS). We present an algorithm that has a linear-time complexity for graphs with a constant treewidth. Generalization from sequences to graphs allows the algorithm to be used on pangenome graphs representing several related genomes and can be seen as a common abstraction for several biological problems on pangenomes, including searching for CpG islands, ChIP-seq data analysis, analysis of region enrichment for functional elements, or simple chaining problems.<\/jats:p>","DOI":"10.3389\/fbinf.2024.1391086","type":"journal-article","created":{"date-parts":[[2024,7,1]],"date-time":"2024-07-01T04:46:52Z","timestamp":1719809212000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Maximum-scoring path sets on pangenome graphs of constant treewidth"],"prefix":"10.3389","volume":"4","author":[{"given":"Bro\u0148a","family":"Brejov\u00e1","sequence":"first","affiliation":[]},{"given":"Travis","family":"Gagie","sequence":"additional","affiliation":[]},{"given":"Eva","family":"Herencs\u00e1rov\u00e1","sequence":"additional","affiliation":[]},{"given":"Tom\u00e1\u0161","family":"Vina\u0159","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2024,7,1]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1137\/0608024","article-title":"Complexity of finding embeddings in a k-tree","volume":"8","author":"Arnborg","year":"1987","journal-title":"SIAM J. Algebraic Discrete Methods"},{"key":"B2","doi-asserted-by":"publisher","first-page":"308","DOI":"10.1016\/0196-6774(91)90006-k","article-title":"Easy problems for tree-decomposable graphs","volume":"12","author":"Arnborg","year":"1991","journal-title":"J. Algorithms"},{"volume-title":"Computing maximum-scoring segments optimally","year":"2007","author":"Bengtsson","key":"B3"},{"key":"B4","doi-asserted-by":"publisher","first-page":"1305","DOI":"10.1137\/s0097539793251219","article-title":"A linear time algorithm for finding tree-decompositions of small treewidth","volume":"25","author":"Bodlaender","year":"1996","journal-title":"SIAM J. Comput."},{"key":"B5","first-page":"19","article-title":"Treewidth: algorithmic techniques and results","volume-title":"International symposium on mathematical foundations of computer science","author":"Bodlaender","year":"1997"},{"key":"B6","doi-asserted-by":"crossref","DOI":"10.1137\/1.9780898719796","volume-title":"Graph classes: a survey","author":"Brandst\u00e4dt","year":"1999"},{"key":"B7","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1137\/1.9781611977073.18","article-title":"Sparsifying, shrinking and splicing for minimum path cover in parameterized linear time","volume-title":"Proceedings of the 2022 annual ACM-SIAM symposium on discrete algorithms (SODA)","author":"C\u00e1ceres","year":"2022"},{"key":"B8","first-page":"58","article-title":"Sequence to graph alignment using gap-sensitive co-linear chaining","volume-title":"International conference on research in computational molecular biology (RECOMB2023)","author":"Chandra","year":"2023"},{"key":"B9","doi-asserted-by":"publisher","first-page":"i146","DOI":"10.1093\/bioinformatics\/btaa446","article-title":"Distance indexing and seed clustering in sequence graphs","volume":"36","author":"Chang","year":"2020","journal-title":"Bioinformatics"},{"key":"B10","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1093\/bib\/bbw089","article-title":"Computational pan-genomics: status, promises and challenges","volume":"19","year":"2018","journal-title":"Briefings Bioinforma."},{"key":"B11","doi-asserted-by":"publisher","first-page":"453","DOI":"10.1186\/1471-2105-7-453","article-title":"REEF: searching REgionally Enriched Features in genomes","volume":"7","author":"Coppe","year":"2006","journal-title":"BMC Bioinforma."},{"key":"B12","doi-asserted-by":"publisher","first-page":"e15","DOI":"10.1093\/nar\/gku1196","article-title":"Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins","volume":"43","author":"Croucher","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"B13","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1109\/tcbb.2004.43","article-title":"Maximum-scoring segment sets","volume":"1","author":"Cs\u0171r\u00f6s","year":"2004","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinforma."},{"key":"B14","doi-asserted-by":"publisher","first-page":"1010","DOI":"10.1101\/gad.2037511","article-title":"CpG islands and the regulation of transcription","volume":"25","author":"Deaton","year":"2011","journal-title":"Genes & Dev."},{"key":"B15","doi-asserted-by":"publisher","first-page":"2446","DOI":"10.1093\/bioinformatics\/btr404","article-title":"PREDA: an R-package to identify regional variations in genomic data","volume":"27","author":"Ferrari","year":"2011","journal-title":"Bioinformatics"},{"key":"B16","doi-asserted-by":"publisher","first-page":"875","DOI":"10.1038\/nbt.4227","article-title":"Variation graph toolkit improves read mapping by representing genetic variation in the reference","volume":"36","author":"Garrison","year":"2018","journal-title":"Nat. Biotechnol."},{"key":"B17","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1007\/978-3-319-19929-0_17","article-title":"Encodings of range maximum-sum segment queries and applications","volume-title":"Combinatorial pattern matching (CPM)","author":"Gawrychowski","year":"2015"},{"key":"B18","doi-asserted-by":"publisher","first-page":"437","DOI":"10.1007\/s10878-019-00488-w","article-title":"Nontrivial path covers of graphs: existence, minimization and maximization","volume":"39","author":"G\u00f3mez","year":"2020","journal-title":"J. Comb. Optim."},{"key":"B19","doi-asserted-by":"publisher","first-page":"e1006731","DOI":"10.1371\/journal.pcbi.1006731","article-title":"Graph peak caller: calling ChIP-seq peaks on graph-based reference genomes","volume":"15","author":"Grytten","year":"2019","journal-title":"PLoS Comput. Biol."},{"key":"B20","doi-asserted-by":"publisher","first-page":"3018","DOI":"10.1038\/s41467-019-11023-0","article-title":"A genome-wide scan statistic framework for whole-genome sequence data analysis","volume":"10","author":"He","year":"2019","journal-title":"Nat. Commun."},{"key":"B21","doi-asserted-by":"publisher","first-page":"S181","DOI":"10.1093\/bioinformatics\/18.suppl_1.s181","article-title":"Splicing graphs and EST assembly problem","volume":"18","author":"Heber","year":"2002","journal-title":"Bioinformatics"},{"key":"B22","first-page":"232","article-title":"Identifying clusters in graph representations of genomes","volume":"3498","author":"Herencs\u00e1rov\u00e1","year":"2023","journal-title":"Proc. 23rd Conf. Inf. Technol. \u2013 Appl. Theory (ITAT 2023)"},{"key":"B23","doi-asserted-by":"publisher","first-page":"104616","DOI":"10.1016\/j.ic.2020.104616","article-title":"Efficient pattern matching in elastic-degenerate strings","volume":"279","author":"Iliopoulos","year":"2021","journal-title":"Inf. Comput."},{"key":"B24","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1038\/ng.1028","article-title":"De novo assembly and genotyping of variants using colored de Bruijn graphs","volume":"44","author":"Iqbal","year":"2012","journal-title":"Nat. Genet."},{"key":"B25","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/978-1-4612-1578-3_14","article-title":"Spatial scan statistics: models, calculations, and applications","volume-title":"Scan statistics and applications","author":"Kulldorff","year":"1999"},{"key":"B26","doi-asserted-by":"publisher","first-page":"265","DOI":"10.1186\/s13059-020-02168-z","article-title":"The design and construction of reference pangenome graphs with minigraph","volume":"21","author":"Li","year":"2020","journal-title":"Genome Biol."},{"key":"B27","doi-asserted-by":"publisher","first-page":"491","DOI":"10.1016\/s0097-8485(02)00010-4","article-title":"Applications of recursive segmentation to the analysis of DNA sequences","volume":"26","author":"Li","year":"2002","journal-title":"Comput. Chem."},{"key":"B28","doi-asserted-by":"publisher","first-page":"btad460","DOI":"10.1093\/bioinformatics\/btad460","article-title":"Chaining for accurate alignment of erroneous long reads to acyclic variation graphs","volume":"39","author":"Ma","year":"2023","journal-title":"Bioinformatics"},{"key":"B29","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3301312","article-title":"Sparse dynamic programming on DAGs with small width","volume":"15","author":"M\u00e4kinen","year":"2019","journal-title":"ACM Trans. Algorithms (TALG)"},{"key":"B30","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1186\/s13015-022-00213-z","article-title":"Tree diet: reducing the treewidth to unlock FPT algorithms in RNA bioinformatics","volume":"17","author":"Marchand","year":"2022","journal-title":"Algorithms Mol. Biol."},{"key":"B31","doi-asserted-by":"publisher","first-page":"469","DOI":"10.1089\/cmb.2010.0252","article-title":"Cactus graphs for genome comparisons","volume":"18","author":"Paten","year":"2011","journal-title":"J. Comput. Biol."},{"key":"B32","doi-asserted-by":"publisher","first-page":"1814","DOI":"10.1101\/gr.076554.108","article-title":"Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs","volume":"18","author":"Paten","year":"2008","journal-title":"Genome Res."},{"key":"B33","doi-asserted-by":"publisher","first-page":"1786","DOI":"10.1101\/gr.2395204","article-title":"De novo repeat classification and fragment assembly","volume":"14","author":"Pevzner","year":"2004","journal-title":"Genome Res."},{"key":"B34","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/s13015-024-00250-w","article-title":"Co-linear chaining on pangenome graphs","volume":"19","author":"Rajput","year":"2024","journal-title":"Algorithms Mol. Biol."},{"key":"B35","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1007\/978-3-031-43980-3_29","article-title":"Chaining of maximal exact matches in graphs","volume-title":"International symposium on string processing and information retrieval","author":"Rizzo","year":"2023"},{"key":"B36","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1016\/0196-6774(86)90023-4","article-title":"Graph minors. II. Algorithmic aspects of tree-width","volume":"7","author":"Robertson","year":"1986","journal-title":"J. Algorithms"},{"key":"B37","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1186\/s13015-022-00216-w","article-title":"Treewidth-based algorithms for the small parsimony problem on networks","volume":"17","author":"Scornavacca","year":"2022","journal-title":"Algorithms Mol. Biol."},{"key":"B38","doi-asserted-by":"publisher","first-page":"3899","DOI":"10.1093\/nar\/27.19.3899","article-title":"Comparison of five methods for finding conserved sequences in multiple alignments of gene regulatory regions","volume":"27","author":"Stojanovic","year":"1999","journal-title":"Nucleic Acids Res."},{"key":"B39","doi-asserted-by":"publisher","first-page":"1345","DOI":"10.1109\/tcbb.2015.2418753","article-title":"Explaining a weighted DAG with few paths for solving genome-guided multi-assembly","volume":"12","author":"Tomescu","year":"2015","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinforma."},{"key":"B40","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1137\/0211023","article-title":"The recognition of series parallel digraphs","volume":"11","author":"Valdes","year":"1982","journal-title":"SIAM J. Comput."},{"key":"B41","doi-asserted-by":"publisher","first-page":"535","DOI":"10.1186\/s12859-022-05090-2","article-title":"Detecting clusters of transcription factors based on a nonhomogeneous Poisson process model","volume":"23","author":"Wu","year":"2022","journal-title":"BMC Bioinforma."}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2024.1391086\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,1]],"date-time":"2024-07-01T04:47:00Z","timestamp":1719809220000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2024.1391086\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,1]]},"references-count":41,"alternative-id":["10.3389\/fbinf.2024.1391086"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2024.1391086","relation":{},"ISSN":["2673-7647"],"issn-type":[{"type":"electronic","value":"2673-7647"}],"subject":[],"published":{"date-parts":[[2024,7,1]]},"article-number":"1391086"}}