{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T14:31:19Z","timestamp":1773066679039,"version":"3.50.1"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2026,3,1]],"date-time":"2026-03-01T00:00:00Z","timestamp":1772323200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T00:00:00Z","timestamp":1773014400000},"content-version":"vor","delay-in-days":8,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2026,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Nanopore sequencing offers the ability for real-time analysis of long DNA sequences at a low cost, enabling new applications such as early detection of cancer. Due to the complex nature of nanopore measurements and the high cost of obtaining ground-truth datasets, in-silico nanopore simulators play an important role in this field. Existing simulators rely on hand-crafted rules and parameters and do not learn an internal representation that would allow for analyzing underlying biological factors of interest. In this work, we investigate and extend Variational Autoregressive DNA-conditioned Autoencoder (VADA), a purely data-driven method for simulating nanopore signals based on an autoregressive latent variable model. We show that VADA can effectively model DNA-conditioned probability distributions over nanopore current sequences to produce varying current observations. 
We show that improving the flexibility of the conditional prior with conditional Real NVP flow significantly improves the simulation quality of VADA, a model that we refer to as VADA Normalizing Flow. We empirically demonstrate that our model achieves competitive simulation performance on experimental nanopore data. Moreover, we show that our model learns an informative latent representation that is predictive of the DNA labels. We hypothesize that other biological factors of interest, beyond DNA labels, can potentially be extracted from such a learned latent representation.<\/jats:p>","DOI":"10.1007\/s10994-026-07001-5","type":"journal-article","created":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T10:13:30Z","timestamp":1773051210000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["VADA-NF: Improved Data-Driven Simulation of Nanopore Sequencing"],"prefix":"10.1007","volume":"115","author":[{"given":"Simon Martinus","family":"Koop","sequence":"first","affiliation":[]},{"given":"Mohammadmahdi","family":"Mehmanchi","sequence":"additional","affiliation":[]},{"given":"Jonas","family":"Niederle","sequence":"additional","affiliation":[]},{"given":"Marc","family":"Pag\u00e8s-Gallego","sequence":"additional","affiliation":[]},{"given":"Vlado","family":"Menkovski","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,3,9]]},"reference":[{"key":"7001_CR1","unstructured":"Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016). Layer normalization."},{"issue":"24","key":"7001_CR2","doi-asserted-by":"publisher","first-page":"7244","DOI":"10.3390\/S20247244","volume":"20","author":"W Chen","year":"2020","unstructured":"Chen, W., Zhang, P., Song, L., Yang, J., & Han, C. (2020). Simulation of nanopore sequencing signals based on BiGRU. Sensors, 20(24), 7244. 
https:\/\/doi.org\/10.3390\/S20247244","journal-title":"Sensors"},{"key":"7001_CR3","doi-asserted-by":"publisher","first-page":"518","DOI":"10.1038\/nbt.3423","volume":"34","author":"D Deamer","year":"2016","unstructured":"Deamer, D., Akeson, M., & Branton, D. (2016). Three decades of nanopore sequencing. Nature Biotechnology, 34, 518. https:\/\/doi.org\/10.1038\/nbt.3423","journal-title":"Nature Biotechnology"},{"key":"7001_CR4","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0257521","author":"C Delahaye","year":"2021","unstructured":"Delahaye, C., & Nicolas, J. (2021). Sequencing DNA with nanopores: Troubles and biases. PLoS ONE. https:\/\/doi.org\/10.1371\/journal.pone.0257521","journal-title":"PLoS ONE"},{"key":"7001_CR5","unstructured":"Dinh, L., Sohl-Dickstein, J., & Bengio, S. (2017). Density estimation using real NVP. In International conference on learning representations."},{"key":"7001_CR6","unstructured":"Doersch, C. (2016). Tutorial on variational autoencoders."},{"key":"7001_CR7","unstructured":"Durkan, C., Bekasov, A., Murray, I., & Papamakarios, G. (2019). Neural spline flows. Advances in Neural Information Processing Systems, 32."},{"key":"7001_CR8","unstructured":"Grathwohl, W., Chen, R. T., Bettencourt, J., Sutskever, I., & Duvenaud, D. (2018). FFJORD: Free-form continuous dynamics for scalable reversible generative models. arXiv preprint arXiv:1810.01367"},{"key":"7001_CR9","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 770\u2013778).","DOI":"10.1109\/CVPR.2016.90"},{"key":"7001_CR10","unstructured":"Higgins, I., Matthey, L., Pal, A., Burgess, C., Glorot, X., Botvinick, M., Mohamed, S., & Lerchner, A. (2016). beta-VAE: Learning basic visual concepts with a constrained variational framework. 
In International conference on learning representations."},{"key":"7001_CR11","first-page":"322","volume":"121","author":"M Ilse","year":"2020","unstructured":"Ilse, M., Tomczak, J. M., Louizos, C., & Welling, M. (2020). DIVA: Domain invariant variational autoencoders. Proceedings of Machine Learning Research, 121, 322\u2013348.","journal-title":"Proceedings of Machine Learning Research"},{"key":"7001_CR12","unstructured":"Kingma, D. P., & Ba, J. L. (2014). ADAM: A method for stochastic optimization. In 3rd international conference on learning representations, ICLR 2015\u2014conference track proceedings."},{"key":"7001_CR13","unstructured":"Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. In 2nd international conference on learning representations, ICLR 2014\u2014conference track proceedings."},{"issue":"17","key":"7001_CR14","doi-asserted-by":"publisher","first-page":"2899","DOI":"10.1093\/BIOINFORMATICS\/BTY223","volume":"34","author":"Y Li","year":"2018","unstructured":"Li, Y., Han, R., Bi, C., Li, M., Wang, S., & Gao, X. (2018). DeepSimulator: A deep simulator for nanopore sequencing. Bioinformatics, 34(17), 2899\u20132908. https:\/\/doi.org\/10.1093\/BIOINFORMATICS\/BTY223","journal-title":"Bioinformatics"},{"issue":"7","key":"7001_CR15","doi-asserted-by":"publisher","first-page":"214","DOI":"10.3390\/bios11070214","volume":"11","author":"B Lin","year":"2021","unstructured":"Lin, B., Hui, J., & Mao, H. (2021). Nanopore technology and its applications in gene sequencing. Biosensors, 11(7), 214. https:\/\/doi.org\/10.3390\/bios11070214","journal-title":"Biosensors"},{"issue":"8","key":"7001_CR16","doi-asserted-by":"publisher","first-page":"2578","DOI":"10.1093\/BIOINFORMATICS\/BTZ963","volume":"36","author":"Y Li","year":"2020","unstructured":"Li, Y., Wang, S., Wang, S., Bi, C., Qiu, Z., Li, M., & Gao, X. (2020). DeepSimulator1.5: A more powerful, quicker and lighter simulator for Nanopore sequencing. 
Bioinformatics, 36(8), 2578\u20132580. https:\/\/doi.org\/10.1093\/BIOINFORMATICS\/BTZ963","journal-title":"Bioinformatics"},{"key":"7001_CR17","doi-asserted-by":"crossref","unstructured":"Niederle, J., Koop, S., Pag\u00e8s-Gallego, M., & Menkovski, V. (2024). VADA: A data-driven simulator for nanopore sequencing. In International conference on discovery science (pp. 198\u2013210). Springer.","DOI":"10.1007\/978-3-031-78977-9_13"},{"issue":"3","key":"7001_CR18","doi-asserted-by":"publisher","first-page":"246","DOI":"10.1080\/15384047.2016.1139236","volume":"17","author":"AL Norris","year":"2016","unstructured":"Norris, A. L., Workman, R. E., Fan, Y., Eshleman, J. R., & Timp, W. (2016). Nanopore sequencing detects structural variants in cancer. Cancer Biology and Therapy, 17(3), 246\u2013253. https:\/\/doi.org\/10.1080\/15384047.2016.1139236","journal-title":"Cancer Biology and Therapy"},{"issue":"1","key":"7001_CR19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/S13059-023-02903-2","volume":"24","author":"M Pag\u00e8s-Gallego","year":"2023","unstructured":"Pag\u00e8s-Gallego, M., & Ridder, J. (2023). Comprehensive benchmark and architectural analysis of deep learning models for nanopore sequencing basecalling. Genome Biology, 24(1), 1\u201318. https:\/\/doi.org\/10.1186\/S13059-023-02903-2","journal-title":"Genome Biology"},{"issue":"57","key":"7001_CR20","first-page":"1","volume":"22","author":"G Papamakarios","year":"2021","unstructured":"Papamakarios, G., Nalisnick, E., Rezende, D. J., Mohamed, S., & Lakshminarayanan, B. (2021). Normalizing flows for probabilistic modeling and inference. Journal of Machine Learning Research, 22(57), 1\u201364.","journal-title":"Journal of Machine Learning Research"},{"issue":"1","key":"7001_CR21","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1186\/s13059-018-1462-9","volume":"19","author":"FJ Rang","year":"2018","unstructured":"Rang, F. J., Kloosterman, W. P., & Ridder, J. 
(2018). From squiggle to basepair: Computational approaches for improving nanopore sequencing read accuracy. Genome Biology, 19(1), 90.","journal-title":"Genome Biology"},{"key":"7001_CR22","unstructured":"Rezende, D., & Mohamed, S. (2015). Variational inference with normalizing flows. In International conference on machine learning (pp. 1530\u20131538). PMLR."},{"key":"7001_CR23","doi-asserted-by":"publisher","unstructured":"Rohrandt, C., Kraft, N., Gie\u00dfelmann, P., Br\u00e4ndl, B., Schuldt, B. M., Jetzek, U., & M\u00fcller, F. J. (2019). Nanopore SimulatION: A raw data simulator for nanopore sequencing. In Proceedings\u20142018 IEEE international conference on bioinformatics and biomedicine, BIBM 2018 (pp. 1536\u20131543). https:\/\/doi.org\/10.1109\/BIBM.2018.8621253","DOI":"10.1109\/BIBM.2018.8621253"},{"key":"7001_CR24","unstructured":"Tschannen, M., Bachem, O., & Lucic, M. (2018). Recent advances in autoencoder-based representation learning."},{"issue":"11","key":"7001_CR25","doi-asserted-by":"publisher","first-page":"1348","DOI":"10.1038\/s41587-021-01108-x","volume":"39","author":"Y Wang","year":"2021","unstructured":"Wang, Y., Zhao, Y., Bollas, A., Wang, Y., & Au, K. F. (2021). Nanopore sequencing technology, bioinformatics and applications. Nature Biotechnology, 39(11), 1348\u20131365. 
https:\/\/doi.org\/10.1038\/s41587-021-01108-x","journal-title":"Nature Biotechnology"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-026-07001-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-026-07001-5","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-026-07001-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T10:13:37Z","timestamp":1773051217000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-026-07001-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3]]},"references-count":25,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,3]]}},"alternative-id":["7001"],"URL":"https:\/\/doi.org\/10.1007\/s10994-026-07001-5","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3]]},"assertion":[{"value":"3 April 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 August 2025","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 January 2026","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 March 2026","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors 
have no competing interests to declare that are relevant to the content of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical Approval and Consent to Participate"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for Publication"}}],"article-number":"66"}}