{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T00:17:39Z","timestamp":1760660259922,"version":"build-2065373602"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2025,9,24]],"date-time":"2025-09-24T00:00:00Z","timestamp":1758672000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"SMARTEn"},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS-1910193"],"award-info":[{"award-number":["CNS-1910193"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,10,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Oxford Nanopore Technologies\u2019 devices, such as MinION, permit affordable, real-time DNA sequencing, and come with targeted sequencing capabilities. Such capabilities create new challenges for metagenomic classifiers that must be computationally efficient yet robust enough to handle potentially erroneous DNA reads, while ideally inspecting only a few hundred bases of a read. Currently available DNA classifiers leave room for improvement with respect to classification accuracy, memory usage, and the ability to operate in targeted sequencing scenarios.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We present SKiM: Short K-mers in Metagenomics, a new lightweight metagenomic classifier designed for ONT reads. Compared to state-of-the-art classifiers, SKiM requires only a fraction of memory to run, and can classify DNA reads with higher accuracy after inspecting only their first few hundred bases. To achieve this, SKiM introduces new data compression techniques to maintain a reference database built from short k-mers, and treats classification as a statistical testing problem.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>SKiM source code, documentation, and test data are available from: https:\/\/gitlab.com\/SCoRe-Group\/skim.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf537","type":"journal-article","created":{"date-parts":[[2025,9,24]],"date-time":"2025-09-24T13:58:18Z","timestamp":1758722298000},"source":"Crossref","is-referenced-by-count":0,"title":["SKiM: accurately classifying metagenomic ONT reads in limited memory"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-1299-2194","authenticated-orcid":false,"given":"Trevor","family":"Schneggenburger","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY 14260,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1686-9697","authenticated-orcid":false,"given":"Jaroslaw","family":"Zola","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY 14260,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,9,24]]},"reference":[{"key":"2025101607391355800_btaf537-B1","doi-asserted-by":"crossref","first-page":"122","DOI":"10.1186\/s13059-023-02958-1","article-title":"SPUMONI 2: improved classification using a pangenome index of minimizer digests","volume":"24","author":"Ahmed","year":"2023","journal-title":"Genome Biol"},{"key":"2025101607391355800_btaf537-B2","doi-asserted-by":"crossref","first-page":"102696","DOI":"10.1016\/j.isci.2021.102696","article-title":"Pan-genomic matching statistics for targeted nanopore sequencing","volume":"24","author":"Ahmed","year":"2021","journal-title":"iScience"},{"key":"2025101607391355800_btaf537-B3","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1186\/s13059-019-1875-0","article-title":"Dashing: fast and accurate genomic distances with HyperLogLog","volume":"20","author":"Baker","year":"2019","journal-title":"Genome Biol"},{"key":"2025101607391355800_btaf537-B4","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1016\/S0022-0000(76)80045-1","article-title":"Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms","volume":"13","author":"Booth","year":"1976","journal-title":"J Comput Syst Sci"},{"key":"2025101607391355800_btaf537-B5","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1186\/s13059-018-1568-0","article-title":"KrakenUniq: confident and fast metagenomics classification using unique k-mer counts","volume":"19","author":"Breitwieser","year":"2018","journal-title":"Genome Biol"},{"key":"2025101607391355800_btaf537-B6","doi-asserted-by":"crossref","first-page":"709","DOI":"10.1002\/spe.2325","article-title":"Better bitmap performance with roaring bitmaps","volume":"46","author":"Chambi","year":"2016","journal-title":"Softw Pract Exp"},{"key":"2025101607391355800_btaf537-B7","doi-asserted-by":"crossref","first-page":"e10805","DOI":"10.7717\/peerj.10805","article-title":"Syncmers are more sensitive than minimizers for selecting conserved k-mers in biological sequences","volume":"9","author":"Edgar","year":"2021","journal-title":"PeerJ"},{"year":"2007","author":"Flajolet","key":"2025101607391355800_btaf537-B8"},{"first-page":"13","year":"2004","author":"Johnson","key":"2025101607391355800_btaf537-B9"},{"key":"2025101607391355800_btaf537-B10","doi-asserted-by":"crossref","first-page":"1721","DOI":"10.1101\/gr.210641.116","article-title":"Centrifuge: rapid and sensitive classification of metagenomic sequences","volume":"26","author":"Kim","year":"2016","journal-title":"Genome Res"},{"key":"2025101607391355800_btaf537-B11","doi-asserted-by":"crossref","first-page":"10991","DOI":"10.1038\/s41598-023-37134-9","article-title":"Metagenomic surveillance for bacterial tick-borne pathogens using nanopore adaptive sampling","volume":"13","author":"Kipp","year":"2023","journal-title":"Sci Rep"},{"first-page":"1","year":"2018","author":"Ko","key":"2025101607391355800_btaf537-B12"},{"key":"2025101607391355800_btaf537-B13","doi-asserted-by":"crossref","first-page":"6787","DOI":"10.3390\/s23156787","article-title":"Estimated nucleotide reconstruction quality symbols of basecalling tools for oxford nanopore sequencing","volume":"23","author":"Ku\u015bmirek","year":"2023","journal-title":"Sensors"},{"key":"2025101607391355800_btaf537-B14","doi-asserted-by":"crossref","first-page":"5125","DOI":"10.1038\/s41598-020-61989-x","article-title":"Benchmarking the MinION: evaluating long reads for microbial profiling","volume":"10","author":"Leidenfrost","year":"2020","journal-title":"Sci Rep"},{"key":"2025101607391355800_btaf537-B15","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2025101607391355800_btaf537-B16","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1038\/nmeth.3930","article-title":"Real-time selective sequencing using nanopore technology","volume":"13","author":"Loose","year":"2016","journal-title":"Nat Methods"},{"key":"2025101607391355800_btaf537-B17","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/s13059-021-02582-x","article-title":"Nanopore adaptive sampling: a tool for enrichment of low abundance species in metagenomic samples","volume":"23","author":"Martin","year":"2022","journal-title":"Genome Biol"},{"key":"2025101607391355800_btaf537-B18","doi-asserted-by":"crossref","first-page":"i66","DOI":"10.1093\/bioinformatics\/btad243","article-title":"Coriolis: enabling metagenomic classification on lightweight mobile devices","volume":"39","author":"Mikalsen","year":"2023","journal-title":"Bioinformatics"},{"key":"2025101607391355800_btaf537-B19","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781316588284","volume-title":"Compact Data Structures: A Practical Approach","author":"Navarro","year":"2016"},{"key":"2025101607391355800_btaf537-B20","doi-asserted-by":"crossref","first-page":"giz043","DOI":"10.1093\/gigascience\/giz043","article-title":"Ultra-deep, long-read nanopore sequencing of mock microbial community standards","volume":"8","author":"Nicholls","year":"2019","journal-title":"Gigascience"},{"key":"2025101607391355800_btaf537-B21","doi-asserted-by":"crossref","first-page":"D733","DOI":"10.1093\/nar\/gkv1189","article-title":"Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation","volume":"44","author":"O\u2019Leary","year":"2016","journal-title":"Nucleic Acids Res"},{"year":"1985","author":"Olken","key":"2025101607391355800_btaf537-B22"},{"key":"2025101607391355800_btaf537-B23","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1186\/s12864-015-1419-2","article-title":"CLARK: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers","volume":"16","author":"Ounit","year":"2015","journal-title":"BMC Genomics"},{"key":"2025101607391355800_btaf537-B24","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1038\/s41587-020-00746-x","article-title":"Readfish enables targeted nanopore sequencing of gigabase-sized genomes","volume":"39","author":"Payne","year":"2021","journal-title":"Nat Biotechnol"},{"key":"2025101607391355800_btaf537-B25","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1038\/nature16996","article-title":"Real-time, portable genome sequencing for ebola surveillance","volume":"530","author":"Quick","year":"2016","journal-title":"Nature"},{"key":"2025101607391355800_btaf537-B26","doi-asserted-by":"crossref","first-page":"3363","DOI":"10.1093\/bioinformatics\/bth408","article-title":"Reducing storage requirements for biological sequence comparison","volume":"20","author":"Roberts","year":"2004","journal-title":"Bioinformatics"},{"first-page":"76","year":"2003","author":"Schleimer","key":"2025101607391355800_btaf537-B27"},{"key":"2025101607391355800_btaf537-B28","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1038\/s41597-019-0287-z","article-title":"Shotgun metagenome data of a defined mock community using oxford nanopore, PacBio and Illumina technologies","volume":"6","author":"Sevim","year":"2019","journal-title":"Sci Data"},{"key":"2025101607391355800_btaf537-B29","doi-asserted-by":"crossref","first-page":"914","DOI":"10.1101\/gr.278623.123","article-title":"Fast and space-efficient taxonomic classification of long reads with hierarchical interleaved XOR filters","volume":"34","author":"Ulrich","year":"2024","journal-title":"Genome Res"},{"key":"2025101607391355800_btaf537-B30","doi-asserted-by":"crossref","first-page":"e00945-23","DOI":"10.1128\/msystems.00945-23","article-title":"Nanopore adaptive sampling effectively enriches bacterial plasmids","volume":"9","author":"Ulrich","year":"2024","journal-title":"mSystems"},{"key":"2025101607391355800_btaf537-B31","doi-asserted-by":"crossref","first-page":"i153","DOI":"10.1093\/bioinformatics\/btac223","article-title":"ReadBouncer: precise and scalable adaptive sampling for nanopore sequencing","volume":"38","author":"Ulrich","year":"2022","journal-title":"Bioinformatics"},{"key":"2025101607391355800_btaf537-B32","doi-asserted-by":"crossref","first-page":"1348","DOI":"10.1038\/s41587-021-01108-x","article-title":"Nanopore sequencing technology, bioinformatics and applications","volume":"39","author":"Wang","year":"2021","journal-title":"Nat Biotechnol"},{"key":"2025101607391355800_btaf537-B33","doi-asserted-by":"crossref","first-page":"1316","DOI":"10.21105\/joss.01316","article-title":"Badread: simulation of error-prone long reads","volume":"4","author":"Wick","year":"2019","journal-title":"J Open Source Softw"},{"key":"2025101607391355800_btaf537-B34","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1186\/s13059-019-1891-0","article-title":"Improved metagenomic analysis with Kraken 2","volume":"20","author":"Wood","year":"2019","journal-title":"Genome Biol"},{"key":"2025101607391355800_btaf537-B35","doi-asserted-by":"crossref","first-page":"gigabyte103","DOI":"10.46471\/gigabyte.103","article-title":"Nanopore adaptive sampling enriches for antimicrobial resistance genes in microbial communities","volume":"2023","author":"Wrenn","year":"2023","journal-title":"GigaByte"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf537\/64372244\/btaf537.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/10\/btaf537\/64372244\/btaf537.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/10\/btaf537\/64372244\/btaf537.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T11:39:30Z","timestamp":1760614770000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf537\/8262844"}},"subtitle":[],"editor":[{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,9,24]]},"references-count":35,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,10,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf537","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,10]]},"published":{"date-parts":[[2025,9,24]]},"article-number":"btaf537"}}