{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:44:13Z","timestamp":1753875853272,"version":"3.41.2"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2025,4,3]],"date-time":"2025-04-03T00:00:00Z","timestamp":1743638400000},"content-version":"vor","delay-in-days":5,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["2118251"],"award-info":[{"award-number":["2118251"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Colgate University Research Council"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,3,29]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Targeted enrichment via capture probes, also known as baits, is a promising complementary procedure for next-generation sequencing methods. This technique uses short biotinylated oligonucleotide probes that hybridize with complementary genetic material in a sample. Following hybridization, the target fragments can be easily isolated and processed with minimal contamination from irrelevant material. Designing an efficient set of baits for a set of target sequences, however, is an NP-hard problem.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We develop a novel heuristic algorithm that leverages the similarities between the characteristics of the Minimum Bait Cover and the Closest String problems to reduce the number of baits to cover a given target sequence. 
Our results on real and synthetic datasets demonstrate that our algorithm, OLTA, produces the fewest baits for nearly all experimental settings and datasets. On average, it produces 6% and 11% fewer baits than the next best state-of-the-art methods for two major real datasets, AIV and MEGARES. Also, its bait set has the highest utilization and the minimum redundancy.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Our algorithm is available at github.com\/FuelTheBurn\/OLTA-Optimizing-bait-seLection-for-TArgeted-sequencing. Test data and other software are archived at doi.org\/10.5281\/zenodo.15086636.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf146","type":"journal-article","created":{"date-parts":[[2025,4,3]],"date-time":"2025-04-03T02:44:48Z","timestamp":1743648288000},"source":"Crossref","is-referenced-by-count":0,"title":["OLTA: Optimizing bait seLection for TArgeted sequencing"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-8492-2032","authenticated-orcid":false,"given":"Mete Orhun","family":"Minbay","sequence":"first","affiliation":[{"name":"Department of Computer Science, Colgate University , Hamilton, NY 13346,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8868-1378","authenticated-orcid":false,"given":"Richard","family":"Sun","sequence":"additional","affiliation":[{"name":"Computer and Information Science and Engineering Department, University of Florida , Gainesville, FL 32611,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7987-248X","authenticated-orcid":false,"given":"Vijay","family":"Ramachandran","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Colgate University , Hamilton, NY 13346,","place":["United 
States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0726-5461","authenticated-orcid":false,"given":"Ahmet","family":"Ay","sequence":"additional","affiliation":[{"name":"Departments of Biology and Mathematics, Colgate University , Hamilton, NY 13346,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4403-8612","authenticated-orcid":false,"given":"Tamer","family":"Kahveci","sequence":"additional","affiliation":[{"name":"Computer and Information Science and Engineering Department, University of Florida , Gainesville, FL 32611,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,4,2]]},"reference":[{"key":"2025042616415219900_btaf146-B1","doi-asserted-by":"publisher","first-page":"i177","DOI":"10.1093\/bioinformatics\/btac226","article-title":"Syotti: scalable bait design for DNA enrichment","volume":"38","author":"Alanko","year":"2022","journal-title":"Bioinformatics"},{"key":"2025042616415219900_btaf146-B2","doi-asserted-by":"publisher","first-page":"1407","DOI":"10.3389\/fgene.2019.01407","article-title":"A guide to carrying out a phylogenomic target sequence capture project","volume":"10","author":"Andermann","year":"2019","journal-title":"Front Genet"},{"key":"2025042616415219900_btaf146-B3","doi-asserted-by":"publisher","first-page":"D744","DOI":"10.1093\/nar\/gkac1047","article-title":"MEGARes and AMR++, v3.0: An updated comprehensive database of antimicrobial resistance determinants and an improved software pipeline for classification using high-throughput sequencing","volume":"51","author":"Bonin","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2025042616415219900_btaf146-B4","doi-asserted-by":"publisher","first-page":"639","DOI":"10.1146\/annurev-micro-090817-062436","article-title":"Paleomicrobiology: Diagnosis and evolution of ancient pathogens","volume":"73","author":"Bos","year":"2019","journal-title":"Annu Rev 
Microbiol"},{"key":"2025042616415219900_btaf146-B5","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1111\/1755-0998.12721","article-title":"BaitsTools: Software for hybridization capture bait design","volume":"18","author":"Campana","year":"2018","journal-title":"Mol Ecol Resour"},{"key":"2025042616415219900_btaf146-B6","doi-asserted-by":"publisher","first-page":"4293","DOI":"10.1093\/bioinformatics\/bty548","article-title":"MrBait: universal identification and design of targeted-enrichment capture probes","volume":"34","author":"Chafin","year":"2018","journal-title":"Bioinformatics"},{"year":"2025","key":"2025042616415219900_btaf146-B8716658"},{"key":"2025042616415219900_btaf146-B7","doi-asserted-by":"publisher","first-page":"100069","DOI":"10.1016\/j.crmeth.2021.100069","article-title":"Probe design for simultaneous, targeted capture of diverse metagenomic targets","volume":"1","author":"Dickson","year":"2021","journal-title":"Cell Rep Methods"},{"key":"2025042616415219900_btaf146-B8","doi-asserted-by":"publisher","first-page":"786","DOI":"10.1093\/bioinformatics\/btv646","article-title":"PHYLUCE is a software package for the analysis of conserved genomic loci","volume":"32","author":"Faircloth","year":"2016","journal-title":"Bioinformatics"},{"key":"2025042616415219900_btaf146-B9","doi-asserted-by":"publisher","first-page":"1195","DOI":"10.1093\/bioinformatics\/btm114","article-title":"A fast and flexible approach to oligonucleotide probe design for genomes and gene families","volume":"23","author":"Feng","year":"2007","journal-title":"Bioinformatics"},{"key":"2025042616415219900_btaf146-B10","doi-asserted-by":"publisher","first-page":"2924","DOI":"10.3389\/fmicb.2018.02924","article-title":"Hybrid capture-based next generation sequencing and its application to human infectious diseases","volume":"9","author":"Gaudin","year":"2018","journal-title":"Front 
Microbiol"},{"key":"2025042616415219900_btaf146-B11","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1038\/nbt.1523","article-title":"Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing","volume":"27","author":"Gnirke","year":"2009","journal-title":"Nat Biotechnol"},{"key":"2025042616415219900_btaf146-B12","doi-asserted-by":"publisher","first-page":"919","DOI":"10.1128\/JCM.03050-15","article-title":"Depletion of human DNA in spiked clinical specimens for improvement of sensitivity of pathogen detection by next-generation sequencing","volume":"54","author":"Hasan","year":"2016","journal-title":"J Clin Microbiol"},{"key":"2025042616415219900_btaf146-B13","doi-asserted-by":"publisher","first-page":"4353","DOI":"10.1093\/bioinformatics\/btaa552","article-title":"AnthOligo: automating the design of oligonucleotides for capture\/enrichment technologies","volume":"36","author":"Jayaraman","year":"2020","journal-title":"Bioinformatics"},{"key":"2025042616415219900_btaf146-B14","doi-asserted-by":"publisher","first-page":"2105","DOI":"10.1111\/1755-0998.13598","article-title":"Fishing for DNA? 
designing baits for population genetics in target enrichment experiments: Guidelines, considerations and the new tool superbaits","volume":"22","author":"Jim\u00e9nez-Mena","year":"2022","journal-title":"Mol Ecol Resour"},{"key":"2025042616415219900_btaf146-B15","doi-asserted-by":"publisher","first-page":"e0007184","DOI":"10.1371\/journal.pntd.0007184","article-title":"Application of a targeted-enrichment methodology for full-genome sequencing of dengue 1-4, chikungunya and zika viruses directly from patient samples","volume":"13","author":"Kamaraj","year":"2019","journal-title":"PLoS Negl Trop Dis"},{"key":"2025042616415219900_btaf146-B16","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1016\/S0890-5401(03)00057-9","article-title":"Distinguishing string selection problems","volume":"185","author":"Kevin Lanctot","year":"2003","journal-title":"Information and Computation"},{"key":"2025042616415219900_btaf146-B17","doi-asserted-by":"publisher","first-page":"579","DOI":"10.1186\/s12864-022-08790-4","article-title":"ProbeTools: designing hybridization probes for targeted genomic sequencing of diverse and hypervariable viral taxa","volume":"23","author":"Kuchinski","year":"2022","journal-title":"BMC Genomics"},{"key":"2025042616415219900_btaf146-B18","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1186\/s12859-015-0501-8","article-title":"MetCap: A bioinformatics probe design pipeline for large-scale targeted metagenomics","volume":"16","author":"Kushwaha","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2025042616415219900_btaf146-B19","doi-asserted-by":"publisher","first-page":"1658","DOI":"10.1128\/JCM.01463-16","article-title":"Targeted enrichment for pathogen detection and characterization in three felid species","volume":"55","author":"Lee","year":"2017","journal-title":"J Clin 
Microbiol"},{"key":"2025042616415219900_btaf146-B20","doi-asserted-by":"publisher","first-page":"1875","DOI":"10.1093\/molbev\/msw056","article-title":"BaitFisher: A software package for multispecies target DNA enrichment probe design","volume":"33","author":"Mayer","year":"2016","journal-title":"Mol Biol Evol"},{"key":"2025042616415219900_btaf146-B21","doi-asserted-by":"publisher","first-page":"160","DOI":"10.1038\/s41587-018-0006-x","article-title":"Capturing sequence diversity in metagenomes with comprehensive and scalable probe design","volume":"37","author":"Metsky","year":"2019","journal-title":"Nat Biotechnol"},{"key":"2025042616415219900_btaf146-B22","doi-asserted-by":"publisher","first-page":"101707","DOI":"10.1016\/j.mcp.2021.101707","article-title":"Thermodynamic evaluation of the impact of DNA mismatches in PCR-type SARS-cov-2 primers and probes","volume":"56","author":"Miranda","year":"2021","journal-title":"Mol Cell Probes"},{"key":"2025042616415219900_btaf146-B23","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1186\/s40168-017-0361-8","article-title":"Enrichment allows identification of diverse, rare elements in metagenomic resistome-virulome sequencing","volume":"5","author":"Noyes","year":"2017","journal-title":"Microbiome"},{"key":"2025042616415219900_btaf146-B24","doi-asserted-by":"publisher","first-page":"24645","DOI":"10.1038\/srep24645","article-title":"Characterization of the resistome in manure, soil and wastewater from dairy and beef production systems","volume":"6","author":"Noyes","year":"2016","journal-title":"Sci Rep"},{"key":"2025042616415219900_btaf146-B25","doi-asserted-by":"publisher","first-page":"e195","DOI":"10.1093\/nar\/gkq777","article-title":"Hybridization properties of long nucleic acid probes for detection of variable target sequences, and development of a hybridization prediction algorithm","volume":"38","author":"Ohrmalm","year":"2010","journal-title":"Nucleic Acids 
Res"},{"key":"2025042616415219900_btaf146-B26","doi-asserted-by":"publisher","first-page":"493","DOI":"10.3390\/cells12030493","article-title":"Targeted sequencing approach and its clinical applications for the molecular diagnosis of human diseases","volume":"12","author":"Pei","year":"2023","journal-title":"Cells"},{"key":"2025042616415219900_btaf146-B27","doi-asserted-by":"publisher","first-page":"275","DOI":"10.3390\/pathogens13040275","article-title":"Hybrid-capture target enrichment in human pathogens: Identification, evolution, biosurveillance, and genomic epidemiology","volume":"13","author":"Quek","year":"2024","journal-title":"Pathogens"},{"key":"2025042616415219900_btaf146-B28","doi-asserted-by":"publisher","first-page":"1539","DOI":"10.3390\/diagnostics12071539","article-title":"Target enrichment approaches for next-generation sequencing applications in oncology","volume":"12","author":"Singh","year":"2022","journal-title":"Diagnostics"},{"year":"2013","author":"Smit","key":"2025042616415219900_btaf146-B29"},{"key":"2025042616415219900_btaf146-B30","doi-asserted-by":"publisher","first-page":"323","DOI":"10.1038\/s41576-019-0119-1","article-title":"Ancient pathogen genomics as an\u00a0emerging tool for infectious disease\u00a0research","volume":"20","author":"Spyrou","year":"2019","journal-title":"Nat Rev Genet"},{"key":"2025042616415219900_btaf146-B31","doi-asserted-by":"publisher","first-page":"115869","DOI":"10.1109\/ACCESS.2022.3218003","article-title":"A heuristic solution to the closest string problem using wave function collapse techniques","volume":"10","author":"Xu","year":"2022","journal-title":"IEEE Access"},{"key":"2025042616415219900_btaf146-B32","doi-asserted-by":"publisher","first-page":"3463","DOI":"10.1128\/JCM.00273-11","article-title":"Unbiased parallel detection of viral pathogens in clinical samples by use of a metagenomic approach","volume":"49","author":"Yang","year":"2011","journal-title":"J Clin 
Microbiol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf146\/62845779\/btaf146.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/4\/btaf146\/62845779\/btaf146.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/4\/btaf146\/62845779\/btaf146.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,26]],"date-time":"2025-04-26T20:42:01Z","timestamp":1745700121000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf146\/8104295"}},"subtitle":[],"editor":[{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,3,29]]},"references-count":33,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,3,29]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf146","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,4]]},"published":{"date-parts":[[2025,3,29]]},"article-number":"btaf146"}}