{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T00:33:42Z","timestamp":1770424422501,"version":"3.49.0"},"reference-count":20,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2025,1,9]],"date-time":"2025-01-09T00:00:00Z","timestamp":1736380800000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000324","name":"Gatsby Charitable Foundation","doi-asserted-by":"publisher","award":["PTAG\/022"],"award-info":[{"award-number":["PTAG\/022"]}],"id":[{"id":"10.13039\/501100000324","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,12,26]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Recent advancements in parallel sequencing methods have precipitated a surge in publicly available short-read sequence data. This has encouraged the development of novel computational tools for the de novo assembly of transcriptomes from RNA-seq data. Despite the availability of these tools, performing an end-to-end transcriptome assembly remains a programmatically involved task necessitating familiarity with best practices. Aside from quality control steps, including error correction, adapter trimming, and chimera filtration needing to be correctly used, moving data between programs often requires manual reformatting or restructuring, which can further impede throughput. Here, we introduce Semblans, a tool for streamlining the assembly process that efficiently and consistently produces high-quality transcriptome assemblies.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Semblans abstracts the key quality control, reconstitution, and postprocessing steps of transcriptome assembly from raw short-read sequences to annotated coding sequences. Evaluating its performance against previously assembled transcriptomes on the basis of assembly quality, we find that Semblans produced higher quality assemblies for 98 of the 101 short-read runs tested.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Semblans is written in C++ and runs on Unix-compliant operating systems. Source code, documentation, and compiled binaries are hosted under the GNU General Public License at https:\/\/github.com\/gladshire\/Semblans.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf003","type":"journal-article","created":{"date-parts":[[2025,1,9]],"date-time":"2025-01-09T16:47:56Z","timestamp":1736441276000},"source":"Crossref","is-referenced-by-count":1,"title":["Semblans: automated assembly and processing of RNA-seq data"],"prefix":"10.1093","volume":"41","author":[{"given":"Miles D","family":"Woodcock-Girard","sequence":"first","affiliation":[{"name":"Department of Biological Sciences, University of Illinois at Chicago , Chicago, IL 60607,","place":["United States"]}]},{"given":"Eric C","family":"Bretz","sequence":"additional","affiliation":[{"name":"Department of Biological Sciences, University of Illinois at Chicago , Chicago, IL 60607,","place":["United States"]}]},{"given":"Holly M","family":"Robertson","sequence":"additional","affiliation":[{"name":"The Sainsbury Laboratory, University of Cambridge , Cambridge, CB2 1LR,","place":["United Kingdom"]},{"name":"Department of Genetics, University of Cambridge , Cambridge, CB2 3EJ,","place":["United Kingdom"]}]},{"given":"Karolis","family":"Ramanauskas","sequence":"additional","affiliation":[{"name":"Department of Biological Sciences, University of Illinois at Chicago , Chicago, IL 60607,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1982-4416","authenticated-orcid":false,"given":"Jarrad T","family":"Hampton-Marcell","sequence":"additional","affiliation":[{"name":"Department of Biological Sciences, University of Illinois at Chicago , Chicago, IL 60607,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2928-8899","authenticated-orcid":false,"given":"Joseph F","family":"Walker","sequence":"additional","affiliation":[{"name":"Department of Biological Sciences, University of Illinois at Chicago , Chicago, IL 60607,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,1,9]]},"reference":[{"key":"2025011821454114600_btaf003-B1","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1145\/3569951.3597559","volume-title":"Practice and Experience in Advanced Research Computing 2023: Computing for the Common Good (PEARC '23). Association for Computing Machinery, New York, NY, USA,","author":"Boerner","year":"2023"},{"key":"2025011821454114600_btaf003-B2","doi-asserted-by":"crossref","first-page":"2114","DOI":"10.1093\/bioinformatics\/btu170","article-title":"Trimmomatic: a flexible trimmer for illumina sequence data","volume":"30","author":"Bolger","year":"2014","journal-title":"Bioinformatics"},{"key":"2025011821454114600_btaf003-B3","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using diamond","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat Methods"},{"key":"2025011821454114600_btaf003-B4","first-page":"1","article-title":"Corset: enabling differential gene expression analysis for de novo assembled transcriptomes","volume":"15","author":"Davidson","year":"2014","journal-title":"Genome Biol"},{"key":"2025011821454114600_btaf003-B5","doi-asserted-by":"crossref","first-page":"e1002195","DOI":"10.1371\/journal.pcbi.1002195","article-title":"Accelerated profile hmm searches","volume":"7","author":"Eddy","year":"2011","journal-title":"PLoS Comput Biol"},{"key":"2025011821454114600_btaf003-B6","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1186\/s12859-023-05254-8","article-title":"transxpress: a snakemake pipeline for streamlined de novo transcriptome assembly and annotation","volume":"24","author":"Fallon","year":"2023","journal-title":"BMC Bioinformatics"},{"key":"2025011821454114600_btaf003-B7","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1038\/nbt.1883","article-title":"Full-length transcriptome assembly from rna-seq data without a reference genome","volume":"29","author":"Grabherr","year":"2011","journal-title":"Nat Biotechnol"},{"key":"2025011821454114600_btaf003-B8","doi-asserted-by":"crossref","first-page":"e5428","DOI":"10.7717\/peerj.5428","article-title":"The oyster river protocol: a multi-assembler and kmer approach for de novo transcriptome assembly","volume":"6","author":"MacManes","year":"2018","journal-title":"PeerJ"},{"key":"2025011821454114600_btaf003-B9","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1038\/nmeth.4197","article-title":"Salmon provides fast and bias-aware quantification of transcript expression","volume":"14","author":"Patro","year":"2017","journal-title":"Nat Methods"},{"key":"2025011821454114600_btaf003-B10","doi-asserted-by":"crossref","first-page":"e16456","DOI":"10.7717\/peerj.16456","article-title":"kakapo: easy extraction and annotation of genes from raw rna-seq reads","volume":"11","author":"Ramanauskas","year":"2023","journal-title":"PeerJ"},{"key":"2025011821454114600_btaf003-B11","doi-asserted-by":"crossref","first-page":"2199","DOI":"10.1093\/bioinformatics\/bty903","article-title":"Caars: comparative assembly and annotation of RNA-seq data","volume":"35","author":"Rey","year":"2019","journal-title":"Bioinformatics"},{"key":"2025011821454114600_btaf003-B12","doi-asserted-by":"crossref","first-page":"2070","DOI":"10.1111\/1755-0998.13593","article-title":"Transpi\u2014a comprehensive transcriptome analysis pipeline for de novo transcriptome assembly","volume":"22","author":"Rivera-Vic\u00e9ns","year":"2022","journal-title":"Mol Ecol Resour"},{"key":"2025011821454114600_btaf003-B13","doi-asserted-by":"crossref","first-page":"1134","DOI":"10.1101\/gr.196469.115","article-title":"Transrate: reference-free quality assessment of de novo transcriptome assemblies","volume":"26","author":"Smith-Unna","year":"2016","journal-title":"Genome Res"},{"key":"2025011821454114600_btaf003-B14","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1186\/s13742-015-0089-y","article-title":"Rcorrector: efficient and accurate error correction for illumina RNA-seq reads","volume":"4","author":"Song","year":"2015","journal-title":"Gigascience"},{"key":"2025011821454114600_btaf003-B15","doi-asserted-by":"crossref","first-page":"2129","DOI":"10.1101\/gr.772403","article-title":"Panther: a library of protein families and subfamilies indexed by function","volume":"13","author":"Thomas","year":"2003","journal-title":"Genome Res"},{"key":"2025011821454114600_btaf003-B16","doi-asserted-by":"crossref","first-page":"446","DOI":"10.1002\/ajb2.1069","article-title":"From cacti to carnivores: improved phylotranscriptomic sampling and hierarchical homology inference provide further insight into the evolution of caryophyllales","volume":"105","author":"Walker","year":"2018","journal-title":"Am J Bot"},{"key":"2025011821454114600_btaf003-B17","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1186\/s13059-019-1891-0","article-title":"Improved metagenomic analysis with kraken 2","volume":"20","author":"Wood","year":"2019","journal-title":"Genome Biol"},{"key":"2025011821454114600_btaf003-B18","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1186\/1471-2164-14-328","article-title":"Optimizing de novo assembly of short-read rna-seq data for phylogenomics","volume":"14","author":"Yang","year":"2013","journal-title":"BMC Genomics"},{"key":"2025011821454114600_btaf003-B19","doi-asserted-by":"crossref","first-page":"3081","DOI":"10.1093\/molbev\/msu245","article-title":"Orthology inference in nonmodel organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics","volume":"31","author":"Yang","year":"2014","journal-title":"Mol Biol Evol"},{"key":"2025011821454114600_btaf003-B20","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1038\/s41586-019-1693-2","article-title":"One thousand plant transcriptomes and the phylogenomics of green plants","volume":"574","author":"Zuntini","year":"2019","journal-title":"Nature"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf003\/61396803\/btaf003.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/1\/btaf003\/61396803\/btaf003.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/1\/btaf003\/61396803\/btaf003.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,18]],"date-time":"2025-01-18T21:45:50Z","timestamp":1737236750000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf003\/7950665"}},"subtitle":[],"editor":[{"given":"Yann","family":"Ponty","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,12,26]]},"references-count":20,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,12,26]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf003","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,1]]},"published":{"date-parts":[[2024,12,26]]},"article-number":"btaf003"}}