{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T12:54:09Z","timestamp":1775998449136,"version":"3.50.1"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2024,1,26]],"date-time":"2024-01-26T00:00:00Z","timestamp":1706227200000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Science and Engineering Council of Canada","award":["RGPIN-05952"],"award-info":[{"award-number":["RGPIN-05952"]}]},{"name":"National Science and Engineering Council of Canada","award":["RGPIN-03986"],"award-info":[{"award-number":["RGPIN-03986"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Transcriptomic long-read (LR) sequencing is an increasingly cost-effective technology for probing various RNA features. Numerous tools have been developed to tackle various transcriptomic sequencing tasks (e.g. isoform and gene fusion detection). However, the lack of abundant gold-standard datasets hinders the benchmarking of such tools. Therefore, the simulation of LR sequencing is an important and practical alternative. While the existing LR simulators aim to imitate the sequencing machine noise and to target specific library protocols, they lack some important library preparation steps (e.g. PCR) and are difficult to modify to new and changing library preparation techniques (e.g. single-cell LRs).<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We present TKSM, a modular and scalable LR simulator, designed so that each RNA modification step is targeted explicitly by a specific module. This allows the user to assemble a simulation pipeline as a combination of TKSM modules to emulate a specific sequencing design. Additionally, the input\/output of all the core modules of TKSM follows the same simple format (Molecule Description Format) allowing the user to easily extend TKSM with new modules targeting new library preparation steps.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>TKSM is available as an open source software at https:\/\/github.com\/vpc-ccg\/tksm.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae051","type":"journal-article","created":{"date-parts":[[2024,1,23]],"date-time":"2024-01-23T15:42:38Z","timestamp":1706024558000},"source":"Crossref","is-referenced-by-count":6,"title":["TKSM: highly modular, user-customizable, and scalable transcriptomic sequencing long-read simulator"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8505-9241","authenticated-orcid":false,"given":"Fatih","family":"Karao\u011flano\u011flu","sequence":"first","affiliation":[{"name":"Computing Science Department, Simon Fraser University , Burnaby, BC V5A 1S6, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1294-9830","authenticated-orcid":false,"given":"Baraa","family":"Orabi","sequence":"additional","affiliation":[{"name":"Department of Computer Science, the University of British Columbia , Vancouver, BC V6T 1Z4, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6964-7521","authenticated-orcid":false,"given":"Ryan","family":"Flannigan","sequence":"additional","affiliation":[{"name":"Department of Urologic Sciences, the University of British Columbia , Vancouver, BC V5Z 1M9, Canada"},{"name":"Vancouver Prostate Centre , Vancouver, BC V6H 3Z6, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9837-1878","authenticated-orcid":false,"given":"Cedric","family":"Chauve","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Simon Fraser University , Burnaby, BC V5A 1S6, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1143-0172","authenticated-orcid":false,"given":"Faraz","family":"Hach","sequence":"additional","affiliation":[{"name":"Department of Computer Science, the University of British Columbia , Vancouver, BC V6T 1Z4, Canada"},{"name":"Department of Urologic Sciences, the University of British Columbia , Vancouver, BC V5Z 1M9, Canada"},{"name":"Vancouver Prostate Centre , Vancouver, BC V6H 3Z6, Canada"}]}],"member":"286","published-online":{"date-parts":[[2024,1,25]]},"reference":[{"key":"2024020805380801600_btae051-B1","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1186\/s13059-020-1935-5","article-title":"Opportunities and challenges in long-read sequencing data analysis","volume":"21","author":"Amarasinghe","year":"2020","journal-title":"Genome Biol"},{"key":"2024020805380801600_btae051-B2","doi-asserted-by":"crossref","DOI":"10.1093\/gigascience\/giab003","article-title":"long-read-tools.org: an interactive catalogue of analysis methods for long-read sequencing data","volume":"10","author":"Amarasinghe","year":"2021","journal-title":"GigaScience"},{"key":"2024020805380801600_btae051-B3","article-title":"A systematic benchmark of nanopore long read RNA sequencing for transcript level analysis in human cell lines","author":"Chen","year":"2021","journal-title":"bioRxiv"},{"key":"2024020805380801600_btae051-B4","doi-asserted-by":"crossref","first-page":"104530","DOI":"10.1016\/j.isci.2022.104530","article-title":"Fast and accurate matching of cellular barcodes across short-reads and long-reads of single-cell RNA-seq experiments","volume":"25","author":"Ebrahimi","year":"2022","journal-title":"iScience"},{"key":"2024020805380801600_btae051-B5","doi-asserted-by":"crossref","first-page":"1197","DOI":"10.1038\/nbt.4259","article-title":"Single-cell isoform RNA sequencing characterizes isoforms in thousands of cerebellar cells","volume":"36","author":"Gupta","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2024020805380801600_btae051-B6","doi-asserted-by":"crossref","DOI":"10.1093\/gigascience\/giaa061","article-title":"Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data","volume":"9","author":"Hafezqorani","year":"2020","journal-title":"GigaScience"},{"key":"2024020805380801600_btae051-B7","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1186\/s13059-021-02399-8","article-title":"LIQA: long-read isoform quantification and analysis","volume":"22","author":"Hu","year":"2021","journal-title":"Genome Biol"},{"key":"2024020805380801600_btae051-B8","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1186\/s12864-022-08339-5","article-title":"Genion, an accurate tool to detect gene fusion from long transcriptomics reads","volume":"23","author":"Karaoglanoglu","year":"2022","journal-title":"BMC Genomics"},{"key":"2024020805380801600_btae051-B9","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1186\/s13059-019-1910-1","article-title":"Transcriptome assembly from long-read RNA-seq alignments with StringTie2","volume":"20","author":"Kovaka","year":"2019","journal-title":"Genome Biol"},{"key":"2024020805380801600_btae051-B10","doi-asserted-by":"crossref","first-page":"2578","DOI":"10.1093\/bioinformatics\/btz963","article-title":"DeepSimulator1.5: a more powerful, quicker and lighter simulator for nanopore sequencing","volume":"36","author":"Li","year":"2020","journal-title":"Bioinformatics"},{"key":"2024020805380801600_btae051-B11","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1186\/s12864-020-07207-4","article-title":"LongGF: computational algorithm and software tool for fast and accurate detection of gene fusions by long-read transcriptome sequencing","volume":"21","author":"Liu","year":"2020","journal-title":"BMC Genomics"},{"key":"2024020805380801600_btae051-B12","article-title":"SQANTI-SIM: a simulator of controlled transcript novelty for lrRNA-seq benchmark","author":"Mestre-Tom\u00e1s","journal-title":"bioRxiv"},{"key":"2024020805380801600_btae051-B13","doi-asserted-by":"crossref","first-page":"33","DOI":"10.12688\/f1000research.29032.2","article-title":"Sustainable data analysis with Snakemake","volume":"10","author":"M\u00f6lder","year":"2021","journal-title":"F1000Res"},{"key":"2024020805380801600_btae051-B14","article-title":"Icarust, a real-time simulator for Oxford Nanopore adaptive sampling","author":"Munro","journal-title":"bioRxiv"},{"key":"2024020805380801600_btae051-B15","article-title":"PBSIM3: a simulator for all types of PacBio and ONT long reads","volume":"4","author":"Ono","year":"2022","journal-title":"NAR Genom Bioinform"},{"key":"2024020805380801600_btae051-B16","doi-asserted-by":"crossref","first-page":"e11","DOI":"10.1093\/nar\/gkac1112","article-title":"Freddie: annotation-independent detection and discovery of transcriptomic alternative splicing isoforms using long-read sequencing","volume":"51","author":"Orabi","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024020805380801600_btae051-B17","doi-asserted-by":"crossref","first-page":"3120","DOI":"10.1038\/s41467-019-11049-4","article-title":"High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes","volume":"10","author":"Singh","year":"2019","journal-title":"Nat Commun"},{"key":"2024020805380801600_btae051-B18","doi-asserted-by":"crossref","first-page":"1438","DOI":"10.1038\/s41467-020-15171-6","article-title":"Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns","volume":"11","author":"Tang","year":"2020","journal-title":"Nat Commun"},{"key":"2024020805380801600_btae051-B19","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1186\/s13059-021-02525-6","article-title":"Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing","volume":"22","author":"Tian","year":"2021","journal-title":"Genome Biol"},{"key":"2024020805380801600_btae051-B20","doi-asserted-by":"crossref","first-page":"1316","DOI":"10.21105\/joss.01316","article-title":"Badread: simulation of error-prone long reads","volume":"4","author":"Wick","year":"2019","journal-title":"JOSS"},{"key":"2024020805380801600_btae051-B21","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/gigascience\/gix010","article-title":"NanoSim: nanopore sequence read simulator based on statistical characterization","volume":"6","author":"Yang","year":"2017","journal-title":"GigaScience"},{"key":"2024020805380801600_btae051-B22","doi-asserted-by":"crossref","first-page":"giad013","DOI":"10.1093\/gigascience\/giad013","article-title":"Characterization and simulation of metagenomic nanopore sequencing data with Meta-Nanosim","volume":"12","author":"Yang","year":"2023","journal-title":"GigaScience"},{"key":"2024020805380801600_btae051-B23","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1186\/s13059-023-02907-y","article-title":"Identification of cell barcodes from long-read single-cell RNA-seq with BLAZE","volume":"24","author":"You","year":"2023","journal-title":"Genome Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae051\/56423245\/btae051.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/2\/btae051\/56619418\/btae051.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/2\/btae051\/56619418\/btae051.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,8]],"date-time":"2024-02-08T01:03:17Z","timestamp":1707354197000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae051\/7589926"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,1,25]]},"references-count":23,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae051","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.06.12.544410","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,2,1]]},"published":{"date-parts":[[2024,1,25]]},"article-number":"btae051"}}