{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T00:23:13Z","timestamp":1773188593001,"version":"3.50.1"},"reference-count":24,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2024,12,23]],"date-time":"2024-12-23T00:00:00Z","timestamp":1734912000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,12,26]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Nanopore sequencing represents a significant advancement in genomics, enabling direct long-read DNA sequencing at the single-molecule level. Accurate simulation of nanopore sequencing signals from nucleotide sequences is crucial for method development and for complementing experimental data. Most existing approaches rely on predefined statistical models, which may not adequately capture the properties of experimental signal data. Furthermore, these simulators were developed for earlier versions of nanopore chemistry, which limits their applicability and adaptability to the latest flow cell data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>To enhance the quality of artificial signals, we introduce seq2squiggle, a novel transformer-based, non-autoregressive model designed to generate nanopore sequencing signals from nucleotide sequences. Unlike existing simulators that rely on static k-mer models, our approach learns sequential contextual information from segmented signal data. 
We benchmark seq2squiggle against state-of-the-art simulators on real experimental R9.4.1 and R10.4.1 data, evaluating signal similarity, basecalling accuracy, and variant detection rates. Seq2squiggle consistently outperforms existing tools across multiple datasets, demonstrating superior similarity to real data and offering a robust solution for simulating nanopore sequencing signals with the latest flow cell generation.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>seq2squiggle is freely available on GitHub at: github.com\/ZKI-PH-ImageAnalysis\/seq2squiggle.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae744","type":"journal-article","created":{"date-parts":[[2024,12,16]],"date-time":"2024-12-16T07:28:12Z","timestamp":1734334092000},"source":"Crossref","is-referenced-by-count":3,"title":["End-to-end simulation of nanopore sequencing signals with feed-forward transformers"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6346-7896","authenticated-orcid":false,"given":"Denis","family":"Beslic","sequence":"first","affiliation":[{"name":"Centre for Artificial Intelligence in Public Health Research, Robert Koch Institute , Berlin 13353,","place":["Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7289-3915","authenticated-orcid":false,"given":"Martin","family":"Kucklick","sequence":"additional","affiliation":[{"name":"Institute for Microbiology, Technical University of Braunschweig , Braunschweig 38106,","place":["Germany"]},{"name":"Microbial Proteomics, Helmholtz Centre for Infection Research (HZI) , Braunschweig 38124,","place":["Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1201-3488","authenticated-orcid":false,"given":"Susanne","family":"Engelmann","sequence":"additional","affiliation":[{"name":"Institute for Microbiology, Technical University of 
Braunschweig , Braunschweig 38106,","place":["Germany"]},{"name":"Microbial Proteomics, Helmholtz Centre for Infection Research (HZI) , Braunschweig 38124,","place":["Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1338-2699","authenticated-orcid":false,"given":"Stephan","family":"Fuchs","sequence":"additional","affiliation":[{"name":"Genome Competence Center, Robert Koch Institute, Berlin 13353, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4589-9809","authenticated-orcid":false,"given":"Bernhard Y","family":"Renard","sequence":"additional","affiliation":[{"name":"Data Analytics and Computational Statistics, Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam , Potsdam 14482,","place":["Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-2506-4297","authenticated-orcid":false,"given":"Nils","family":"K\u00f6rber","sequence":"additional","affiliation":[{"name":"Centre for Artificial Intelligence in Public Health Research, Robert Koch Institute , Berlin 13353,","place":["Germany"]}]}],"member":"286","published-online":{"date-parts":[[2024,12,23]]},"reference":[{"key":"2025011404263626900_btae744-B1","doi-asserted-by":"crossref","first-page":"1448","DOI":"10.1038\/s41467-024-45778-y","article-title":"A signal processing and deep learning framework for methylation detection using Oxford Nanopore sequencing","volume":"15","author":"Ahsan","year":"2024","journal-title":"Nat Commun"},{"key":"2025011404263626900_btae744-B2","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1186\/s13059-020-1935-5","article-title":"Opportunities and challenges in long-read sequencing data analysis","volume":"21","author":"Amarasinghe","year":"2020","journal-title":"Genome Biol"},{"key":"2025011404263626900_btae744-B3","doi-asserted-by":"crossref","DOI":"10.1101\/076901","article-title":"SiLiCO: a simulator of long read sequencing in PacBio and Oxford nanopore 
genomics","author":"Baker","year":"2016"},{"key":"2025011404263626900_btae744-B4","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1093\/nar\/27.2.573","article-title":"Tandem repeats finder: a program to analyze DNA sequences","volume":"27","author":"Benson","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2025011404263626900_btae744-B5","doi-asserted-by":"crossref","first-page":"7244","DOI":"10.3390\/s20247244","article-title":"Simulation of nanopore sequencing signals based on BiGRU","volume":"20","author":"Chen","year":"2020","journal-title":"Sensors"},{"key":"2025011404263626900_btae744-B6","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1089\/cmb.2014.0029","article-title":"Joint variant and De novo mutation identification on pedigrees from high-throughput sequencing data","volume":"21","author":"Cleary","year":"2014","journal-title":"J Comput Biol"},{"key":"2025011404263626900_btae744-B7","doi-asserted-by":"crossref","first-page":"e0257521","DOI":"10.1371\/journal.pone.0257521","article-title":"Sequencing DNA with nanopores: troubles and biases","volume":"16","author":"Delahaye","year":"2021","journal-title":"PLoS One"},{"key":"2025011404263626900_btae744-B8","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1038\/s41587-021-01147-4","article-title":"Fast nanopore sequencing data analysis with SLOW5","volume":"40","author":"Gamaarachchi","year":"2022","journal-title":"Nat Biotechnol"},{"key":"2025011404263626900_btae744-B9","doi-asserted-by":"publisher","first-page":"778","DOI":"10.1101\/gr.278730.123","article-title":"Simulation of nanopore sequencing signal data with tunable parameters","volume":"34","author":"Gamaarachchi","year":"2024","journal-title":"Genome 
Res"},{"key":"2025011404263626900_btae744-B10","doi-asserted-by":"publisher","author":"Giesselmann","year":"2021","DOI":"10.17169\/refubium-32662"},{"key":"2025011404263626900_btae744-B11","author":"Huang"},{"key":"2025011404263626900_btae744-B13","doi-asserted-by":"crossref","DOI":"10.1101\/2024.03.05.583511","article-title":"Uncalled4 improves nanopore DNA and RNA modification detection via fast and accurate signal alignment","author":"Kovaka","year":"2024"},{"key":"2025011404263626900_btae744-B14","doi-asserted-by":"crossref","first-page":"2899","DOI":"10.1093\/bioinformatics\/bty223","article-title":"DeepSimulator: a deep simulator for nanopore sequencing","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2025011404263626900_btae744-B15","first-page":"812","author":"Liang"},{"key":"2025011404263626900_btae744-B16","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1186\/s13059-023-02903-2","article-title":"Comprehensive benchmark and architectural analysis of deep learning models for nanopore sequencing basecalling","volume":"24","author":"Pag\u00e8s-Gallego","year":"2023","journal-title":"Genome Biol"},{"key":"2025011404263626900_btae744-B17","first-page":"569","author":"Pratzlich","year":"2016"},{"key":"2025011404263626900_btae744-B18","author":"Ren","year":"2022"},{"key":"2025011404263626900_btae744-B19","first-page":"3171","author":"Ren","year":"2019"},{"key":"2025011404263626900_btae744-B20","doi-asserted-by":"crossref","first-page":"btad352","DOI":"10.1093\/bioinformatics\/btad352","article-title":"Accelerated nanopore basecalling with SLOW5 data format","volume":"39","author":"Samarakoon","year":"2023","journal-title":"Bioinformatics"},{"key":"2025011404263626900_btae744-B21","doi-asserted-by":"crossref","first-page":"giy037","DOI":"10.1093\/gigascience\/giy037","article-title":"Chiron: translating nanopore raw signal directly into nucleotide sequence using deep 
learning","volume":"7","author":"Teng","year":"2018","journal-title":"Gigascience"},{"key":"2025011404263626900_btae744-B22","first-page":"462","author":"Tralie","year":"2020"},{"key":"2025011404263626900_btae744-B23","doi-asserted-by":"crossref","first-page":"1348","DOI":"10.1038\/s41587-021-01108-x","article-title":"Nanopore sequencing technology, bioinformatics and applications","volume":"39","author":"Wang","year":"2021","journal-title":"Nat Biotechnol"},{"key":"2025011404263626900_btae744-B24","author":"Zhang","year":"2023"},{"key":"2025011404263626900_btae744-B25","doi-asserted-by":"crossref","first-page":"797","DOI":"10.1038\/s43588-022-00387-x","article-title":"Symphonizing pileup and full-alignment for deep learning-based long-read variant calling","volume":"2","author":"Zheng","year":"2022","journal-title":"Nat Comput Sci"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae744\/61257982\/btae744.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/1\/btae744\/61257982\/btae744.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/1\/btae744\/61257982\/btae744.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T23:27:06Z","timestamp":1736810826000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae744\/7930676"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,12,23]]},"references-count":24,"journal-issue
":{"issue":"1","published-print":{"date-parts":[[2024,12,26]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae744","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.08.12.607296","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,1]]},"published":{"date-parts":[[2024,12,23]]},"article-number":"btae744"}}