{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T14:38:01Z","timestamp":1771339081111,"version":"3.50.1"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T00:00:00Z","timestamp":1752537600000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DBI-2145171"],"award-info":[{"award-number":["DBI-2145171"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01HG011065"],"award-info":[{"award-number":["R01HG011065"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The established single-cell RNA sequencing (scRNA-seq) technologies have revolutionized biological and biomedical research by enabling the measurement of gene expression at single-cell resolution. However, the fundamental challenge of reconstructing full-length transcripts for individual cells remains unresolved. 
Existing single-sample assembly approaches cannot leverage shared information across cells, while meta-assembly approaches often fail to strike a balance between consensus assembly and preserving cell-specific expression signatures.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We present Beaver, a cell-specific transcript assembler designed for short-read scRNA-seq data. Beaver implements a transcript fragment graph to organize individual assemblies and designs an efficient dynamic programming algorithm that searches for candidate full-length transcripts from the graph. Beaver incorporates two random forest models trained on 51 meticulously engineered features that accurately estimate the likelihood of each candidate transcript being expressed in individual cells. Our experiments, performed using both real and simulated Smart-seq3 scRNA-seq data, firmly show that Beaver substantially outperforms existing meta-assemblers and single-sample assemblers. At the same level of sensitivity, Beaver achieved 32.0%\u201364.6%, 13.5%\u201336.6%, and 9.8%\u201336.3% higher precision on average compared to meta-assemblers Aletsch, TransMeta, and PsiCLASS, respectively, with similar improvements over single-sample assemblers Scallop2 (10.1%\u201343.6%) and StringTie2 (24.3%\u201367.0%).<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Beaver is freely available at https:\/\/github.com\/Shao-Group\/beaver. 
Scripts that reproduce the experimental results of this manuscript are available at https:\/\/github.com\/Shao-Group\/beaver-test.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf236","type":"journal-article","created":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T13:02:40Z","timestamp":1752584560000},"page":"i323-i331","source":"Crossref","is-referenced-by-count":1,"title":["Transcriptome assembly at single-cell resolution with Beaver"],"prefix":"10.1093","volume":"41","author":[{"given":"Qian","family":"Shi","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, The Pennsylvania State University , Pennsylvania, PA 16802,","place":["United States"]}]},{"given":"Qimin","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, The Pennsylvania State University , Pennsylvania, PA 16802,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6112-5139","authenticated-orcid":false,"given":"Mingfu","family":"Shao","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, The Pennsylvania State University , Pennsylvania, PA 16802,","place":["United States"]},{"name":"Huck Institutes of the Life Sciences, The Pennsylvania State University , Pennsylvania, PA 16802,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,7,15]]},"reference":[{"key":"2025071509023226000_btaf236-B1","doi-asserted-by":"crossref","first-page":"2529","DOI":"10.1093\/bioinformatics\/btt442","article-title":"MITIE: simultaneous RNA-Seq-based transcript identification and quantification in multiple samples","volume":"29","author":"Behr","year":"2013","journal-title":"Bioinformatics"},{"key":"2025071509023226000_btaf236-B2","doi-asserted-by":"crossref","first-page":"1197","DOI":"10.1038\/nbt.4259","article-title":"Single-cell isoform RNA sequencing characterizes isoforms in thousands of 
cerebellar cells","volume":"36","author":"Gupta","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2025071509023226000_btaf236-B3","doi-asserted-by":"crossref","first-page":"708","DOI":"10.1038\/s41587-020-0497-0","article-title":"Single-cell RNA counting at allele and isoform resolution using Smart-seq3","volume":"38","author":"Hagemann-Jensen","year":"2020","journal-title":"Nat Biotechnol"},{"key":"2025071509023226000_btaf236-B4","doi-asserted-by":"crossref","first-page":"1452","DOI":"10.1038\/s41587-022-01311-4","article-title":"Scalable single-cell RNA sequencing from full transcripts with Smart-seq3xpress","volume":"40","author":"Hagemann-Jensen","year":"2022","journal-title":"Nat Biotechnol"},{"key":"2025071509023226000_btaf236-B5","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1038\/s41592-022-01408-3","article-title":"Alevin-fry unlocks rapid, accurate and memory-frugal quantification of single-cell RNA-seq data","volume":"19","author":"He","year":"2022","journal-title":"Nat Methods"},{"key":"2025071509023226000_btaf236-B6","doi-asserted-by":"crossref","first-page":"7316","DOI":"10.1038\/s41467-024-51584-3","article-title":"Accurate long-read transcript discovery and quantification at single-cell, pseudo-bulk and bulk resolution with isosceles","volume":"15","author":"Kabza","year":"2024","journal-title":"Nat Commun"},{"key":"2025071509023226000_btaf236-B7","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1186\/s13059-019-1910-1","article-title":"Transcriptome assembly from long-read RNA-seq alignments with StringTie2","volume":"20","author":"Kovaka","year":"2019","journal-title":"Genome Biol"},{"key":"2025071509023226000_btaf236-B8","doi-asserted-by":"crossref","first-page":"4025","DOI":"10.1038\/s41467-020-17800-6","article-title":"High throughput error corrected nanopore single cell transcriptome sequencing","volume":"11","author":"Lebrigand","year":"2020","journal-title":"Nat 
Commun"},{"key":"2025071509023226000_btaf236-B9","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1186\/1471-2105-12-323","article-title":"RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome","volume":"12","author":"Li","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2025071509023226000_btaf236-B10","first-page":"178","author":"Lin","year":"2012"},{"key":"2025071509023226000_btaf236-B11","doi-asserted-by":"crossref","first-page":"4264","DOI":"10.1093\/bioinformatics\/btz240","article-title":"scRNAss: a single-cell RNA-seq assembler via imputing dropouts and combing junctions","volume":"35","author":"Liu","year":"2019","journal-title":"Bioinformatics"},{"key":"2025071509023226000_btaf236-B12","doi-asserted-by":"crossref","first-page":"1202","DOI":"10.1016\/j.cell.2015.05.002","article-title":"Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets","volume":"161","author":"Macosko","year":"2015","journal-title":"Cell"},{"key":"2025071509023226000_btaf236-B13","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nmeth.4078","article-title":"TACO produces robust multisample transcriptome assemblies from RNA-seq","volume":"14","author":"Niknafs","year":"2017","journal-title":"Nat Methods"},{"key":"2025071509023226000_btaf236-B14","doi-asserted-by":"crossref","first-page":"1191","DOI":"10.1101\/gr.260174.119","article-title":"RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes","volume":"30","author":"Nip","year":"2020","journal-title":"Genome Res"},{"key":"2025071509023226000_btaf236-B15","doi-asserted-by":"crossref","first-page":"1346","DOI":"10.1038\/s41592-024-02298-3","article-title":"Systematic assessment of long-read RNA-seq methods for transcript identification and quantification","volume":"21","author":"Pardo-Palacios","year":"2024","journal-title":"Nat 
Methods"},{"key":"2025071509023226000_btaf236-B16","doi-asserted-by":"crossref","DOI":"10.12688\/f1000research.23297.1","article-title":"GFF utilities: GffRead and GffCompare","volume":"9","author":"Pertea","year":"2020","journal-title":"F1000Res"},{"key":"2025071509023226000_btaf236-B17","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1038\/nbt.3122","article-title":"StringTie enables improved reconstruction of a transcriptome from RNA-seq reads","volume":"33","author":"Pertea","year":"2015","journal-title":"Nat Biotechnol"},{"key":"2025071509023226000_btaf236-B18","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1038\/nprot.2014.006","article-title":"Full-length RNA-seq from single cells using Smart-seq2","volume":"9","author":"Picelli","year":"2014","journal-title":"Nat Protoc"},{"key":"2025071509023226000_btaf236-B19","doi-asserted-by":"crossref","first-page":"1167","DOI":"10.1038\/nbt.4020","article-title":"Accurate assembly of transcripts through phase-preserving graph decomposition","volume":"35","author":"Shao","year":"2017","journal-title":"Nat Biotechnol"},{"key":"2025071509023226000_btaf236-B20","doi-asserted-by":"crossref","first-page":"i307","DOI":"10.1093\/bioinformatics\/btae215","article-title":"Accurate assembly of multiple RNA-seq samples with Aletsch","volume":"40","author":"Shi","year":"2024","journal-title":"Bioinformatics"},{"key":"2025071509023226000_btaf236-B21","doi-asserted-by":"crossref","first-page":"3120","DOI":"10.1038\/s41467-019-11049-4","article-title":"High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes","volume":"10","author":"Singh","year":"2019","journal-title":"Nat Commun"},{"key":"2025071509023226000_btaf236-B22","doi-asserted-by":"crossref","first-page":"e98","DOI":"10.1093\/nar\/gkw158","article-title":"CLASS2: accurate and efficient splice variant annotation from RNA-seq 
reads","volume":"44","author":"Song","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2025071509023226000_btaf236-B23","doi-asserted-by":"crossref","first-page":"5000","DOI":"10.1038\/s41467-019-12990-0","article-title":"A multi-sample approach increases the accuracy of transcript assembly","volume":"10","author":"Song","year":"2019","journal-title":"Nat Commun"},{"key":"2025071509023226000_btaf236-B24","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1038\/s41596-024-01057-0","article-title":"kallisto, bustools and kb-python for quantifying bulk, single-cell and single-nucleus RNA-seq","volume":"20","author":"Sullivan","year":"2024","journal-title":"Nat Protocols"},{"key":"2025071509023226000_btaf236-B25","doi-asserted-by":"crossref","first-page":"S15","DOI":"10.1186\/1471-2164-16-S2-S15","article-title":"Accurate inference of isoforms from multiple sample RNA-Seq data","volume":"16","author":"Tasnim","year":"2015","journal-title":"BMC Genomics"},{"key":"2025071509023226000_btaf236-B26","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1186\/s13059-021-02525-6","article-title":"Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing","volume":"22","author":"Tian","year":"2021","journal-title":"Genome Biol"},{"key":"2025071509023226000_btaf236-B27","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1038\/nbt.1621","article-title":"Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation","volume":"28","author":"Trapnell","year":"2010","journal-title":"Nat Biotechnol"},{"key":"2025071509023226000_btaf236-B28","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1016\/j.gpb.2020.02.005","article-title":"Direct comparative analyses of 10\u00d7 genomics chromium and Smart-seq2","volume":"19","author":"Wang","year":"2021","journal-title":"Genomics Proteomics 
Bioinformatics"},{"key":"2025071509023226000_btaf236-B29","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1186\/s13059-018-1571-5","article-title":"Simulation-based benchmarking of isoform quantification in single-cell RNA-seq","volume":"19","author":"Westoby","year":"2018","journal-title":"Genome Biol"},{"key":"2025071509023226000_btaf236-B30","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1186\/s13059-020-01981-w","article-title":"Obstacles to detecting isoforms using full-length scRNA-seq data","volume":"21","author":"Westoby","year":"2020","journal-title":"Genome Biol"},{"key":"2025071509023226000_btaf236-B31","doi-asserted-by":"crossref","first-page":"1398","DOI":"10.1101\/gr.276434.121","article-title":"Transmeta simultaneously assembles multisample RNA-seq reads","volume":"32","author":"Yu","year":"2022","journal-title":"Genome Res"},{"key":"2025071509023226000_btaf236-B32","doi-asserted-by":"crossref","first-page":"e1011734","DOI":"10.1371\/journal.pcbi.1011734","article-title":"Transcript assembly and annotations: bias and adjustment","volume":"19","author":"Zhang","year":"2023","journal-title":"PLoS Comput Biol"},{"key":"2025071509023226000_btaf236-B33","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1038\/s43588-022-00216-1","article-title":"Accurate assembly of multi-end RNA-seq data with Scallop2","volume":"2","author":"Zhang","year":"2022","journal-title":"Nat Comput Sci"},{"key":"2025071509023226000_btaf236-B34","doi-asserted-by":"crossref","first-page":"14049","DOI":"10.1038\/ncomms14049","article-title":"Massively parallel digital transcriptional profiling of single cells","volume":"8","author":"Zheng","year":"2017","journal-title":"Nat 
Commun"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/Supplement_1\/i323\/63745721\/btaf236.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/Supplement_1\/i323\/63745721\/btaf236.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T13:02:45Z","timestamp":1752584565000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/41\/Supplement_1\/i323\/8199411"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,1]]},"references-count":34,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2025,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf236","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,7,1]]}}}