{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T18:11:48Z","timestamp":1755799908951,"version":"3.44.0"},"reference-count":50,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,8,20]],"date-time":"2025-08-20T00:00:00Z","timestamp":1755648000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>Genome assembly remains an unsolved problem, and de novo strategies (i.e., those run without a reference) are relevant but computationally complex tasks in genomics. Although de novo assemblers have been previously successfully applied in genomic projects, there is still no \u201cbest assembler\u201d, and the choice and setup of assemblers still rely on bioinformatics experts. Thus, as with other computationally complex problems, machine learning has emerged as an alternative (or complementary) way to develop accurate, fast and autonomous assemblers. Reinforcement learning has proven promising for solving complex activities without supervision, such as games, and there is a pressing need to understand the limits of this approach to \u201creal-life\u201d problems, such as the DNA fragment assembly problem. In this study, we analyze the boundaries of applying machine learning via reinforcement learning (RL) for genome assembly. We expand upon the previous approach found in the literature to solve this problem by carefully exploring the learning aspects of the proposed intelligent agent, which uses the Q-learning algorithm. We improved the reward system and optimized the exploration of the state space based on pruning and in collaboration with evolutionary computing (&amp;gt;300% improvement). We tested the new approaches on 23 environments. Our results suggest the unsatisfactory performance of the approaches, both in terms of assembly quality and execution time, providing strong evidence for the poor scalability of the studied reinforcement learning approaches to the genome assembly problem. Finally, we discuss the existing proposal, complemented by attempts at improvement that also proved insufficient. In doing so, we contribute to the scientific community by offering a clear mapping of the limitations and challenges that should be taken into account in future attempts to apply reinforcement learning to genome assembly.<\/jats:p>","DOI":"10.3389\/fbinf.2025.1633623","type":"journal-article","created":{"date-parts":[[2025,8,20]],"date-time":"2025-08-20T05:34:43Z","timestamp":1755668083000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Using reinforcement learning in genome assembly: in-depth analysis of a Q-learning assembler"],"prefix":"10.3389","volume":"5","author":[{"given":"Kleber","family":"Padovani","sequence":"first","affiliation":[]},{"given":"Rafael Cabral","family":"Borges","sequence":"additional","affiliation":[]},{"given":"Roberto","family":"Xavier","sequence":"additional","affiliation":[]},{"given":"Andr\u00e9 Carlos","family":"Carvalho","sequence":"additional","affiliation":[]},{"given":"Anna","family":"Reali","sequence":"additional","affiliation":[]},{"given":"Annie","family":"Chateau","sequence":"additional","affiliation":[]},{"given":"Ronnie","family":"Alves","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,8,20]]},"reference":[
{"key":"B1","first-page":"38","article-title":"Removing the genetics from the standard genetic algorithm","volume-title":"Proceedings of ICML\u201995","author":"Baluja","year":"1995"},
{"key":"B2","first-page":"17","article-title":"Intrinsic motivation and reinforcement learning","volume-title":"Intrinsically motivated learning in natural and artificial systems","author":"Barto","year":"2012"},
{"key":"B3","doi-asserted-by":"crossref","DOI":"10.1109\/SYNASC.2011.9","article-title":"A reinforcement learning approach for solving the fragment assembly problem","author":"Bocicor","year":""},
{"key":"B4","doi-asserted-by":"publisher","DOI":"10.24846\/v20i3y201103","article-title":"A distributed Q-learning approach to fragment assembly","volume":"20","author":"Bocicor","year":"","journal-title":"ICI Buchar."},
{"key":"B5","doi-asserted-by":"publisher","first-page":"408","DOI":"10.1016\/j.tics.2019.02.006","article-title":"Reinforcement learning, fast and slow","volume":"23","author":"Botvinick","year":"2019","journal-title":"Trends Cognitive Sci."},
{"key":"B6","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1186\/2047-217x-2-10","article-title":"Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species","volume":"2","author":"Bradnam","year":"2013","journal-title":"GigaScience"},
{"year":"2016","author":"Brockman","key":"B7"},
{"key":"B8","first-page":"211","volume-title":"Pushing the limits","author":"Cook","year":"2012"},
{"volume-title":"Introduction to algorithms","year":"2009","author":"Cormen","key":"B9"},
{"key":"B10","doi-asserted-by":"publisher","first-page":"824","DOI":"10.1007\/s42452-020-2560-3","article-title":"Reinforcement learning applied to games","volume":"2","author":"Crespo","year":"2020","journal-title":"SN Appl. Sci."},
{"key":"B11","article-title":"Challenges of real-world reinforcement learning","volume-title":"ICML 2019 workshop on reinforcement learning for real life (RLRL)","author":"Dulac-Arnold","year":"2019"},
{"key":"B12","doi-asserted-by":"publisher","first-page":"014133","DOI":"10.1103\/PhysRevE.109.014133","article-title":"Phase transition in the computational complexity of the shortest common superstring and genome assembly","volume":"109","author":"Fernandez","year":"2024","journal-title":"Phys. Rev. E"},
{"key":"B14","first-page":"476","article-title":"Epsilon-bmc: a bayesian ensemble approach to epsilon-greedy exploration in model-free reinforcement learning","volume-title":"Proceedings of machine learning research","author":"Gimelfarb","year":"2020"},
{"key":"B15","unstructured":"Grinstead C. M., Snell J. L. Introduction to probability. 2012"},
{"key":"B16","doi-asserted-by":"publisher","first-page":"1291","DOI":"10.1109\/TSMCC.2012.2218595","article-title":"A survey of actor-critic reinforcement learning: standard and natural policy gradients","volume":"42","author":"Grondman","year":"2012","journal-title":"IEEE Trans. Syst. Man, Cybern. Part C Appl. Rev."},
{"key":"B17","doi-asserted-by":"publisher","first-page":"1072","DOI":"10.1093\/bioinformatics\/btt086","article-title":"QUAST: quality assessment tool for genome assemblies","volume":"29","author":"Gurevich","year":"2013","journal-title":"Bioinformatics"},
{"key":"B18","doi-asserted-by":"publisher","first-page":"647","DOI":"10.1038\/s41586-025-08744-2","article-title":"Mastering diverse control tasks through world models","volume":"640","author":"Hafner","year":"2025","journal-title":"Nature"},
{"key":"B19","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},
{"key":"B20","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.ygeno.2015.11.003","article-title":"The sequence of sequencers: the history of sequencing DNA","volume":"107","author":"Heather","year":"2016","journal-title":"Genomics"},
{"key":"B21","doi-asserted-by":"publisher","first-page":"241","DOI":"10.22037\/ghfbb.v17i3.2977","article-title":"Artificial intelligence and bioinformatics: a journey from traditional techniques to smart approaches","volume":"17","author":"Jamialahmadi","year":"2024","journal-title":"Gastroenterology Hepatology Bed Bench"},
{"key":"B22","doi-asserted-by":"publisher","first-page":"14306","DOI":"10.1038\/ncomms14306","article-title":"MetaSort untangles metagenome assembly by reducing microbial community complexity","volume":"8","author":"Ji","year":"2017","journal-title":"Nat. Commun."},
{"key":"B23","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2302.13268","article-title":"Revolutionizing genomics with reinforcement learning techniques","author":"Karami","year":"2023","journal-title":"arXiv"},
{"key":"B24","first-page":"323","volume-title":"Evolutionary computing algorithms","author":"Konar","year":"2005"},
{"key":"B25","first-page":"12","article-title":"1.1 deep learning hardware: past, present, and future","author":"LeCun","year":"2019"},
{"key":"B26","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1093\/bfgp\/elr035","article-title":"Comparison of the two major classes of assembly algorithms: overlap\u2013layout\u2013consensus and de-bruijn-graph","volume":"11","author":"Li","year":"2011","journal-title":"Briefings Funct. Genomics"},
{"key":"B27","first-page":"289","article-title":"Computability of models for sequence assembly","author":"Medvedev","year":"2007"},
{"key":"B28","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},
{"key":"B29","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-79167-3","volume-title":"Applying reinforcement learning on real-world data with practical examples in Python","author":"Osborne","year":"2022"},
{"volume-title":"Using reinforcement learning in genome assembly: in-depth analysis of a Q-learning assembler","year":"2020","author":"Padovani","key":"B30"},
{"key":"B31","doi-asserted-by":"publisher","DOI":"10.1101\/671362","article-title":"A way around the exploration-exploitation dilemma","author":"Peterson","year":"2019","journal-title":"bioRxiv"},
{"key":"B32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/978-3-642-11814-2_1","article-title":"Abstraction and generalization in reinforcement learning: a summary and framework","volume-title":"Adaptive and learning agents","author":"Ponsen","year":"2010"},
{"key":"B33","doi-asserted-by":"publisher","first-page":"1353","DOI":"10.1534\/genetics.116.196956","article-title":"The evolving definition of the term \u201cgene\u201d","volume":"205","author":"Portin","year":"2017","journal-title":"Genetics"},
{"volume-title":"Algorithms illuminated (Part 4): algorithms for NP-hard problems","year":"2020","author":"Roughgarden","key":"B35"},
{"key":"B37","doi-asserted-by":"publisher","first-page":"120495","DOI":"10.1016\/j.eswa.2023.120495","article-title":"Reinforcement learning algorithms: a brief survey","volume":"231","author":"Shakya","year":"2023","journal-title":"Expert Syst. Appl."},
{"key":"B38","doi-asserted-by":"publisher","first-page":"87","DOI":"10.1186\/s12859-021-03997-w","article-title":"geneRFinder: gene finding in distinct metagenomic data complexities","volume":"22","author":"Silva","year":"2021","journal-title":"BMC Bioinforma."},
{"key":"B40","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol."},
{"key":"B41","doi-asserted-by":"publisher","first-page":"2116","DOI":"10.1093\/bib\/bby072","article-title":"Machine learning meets genome assembly","volume":"20","author":"Souza","year":"2018","journal-title":"Brief. Bioinforma."},
{"volume-title":"Reinforcement learning: an introduction","year":"2018","author":"Sutton","key":"B42"},
{"key":"B43","doi-asserted-by":"publisher","first-page":"1633","DOI":"10.5555\/1577069.1755839","article-title":"Transfer learning for reinforcement learning domains: a survey","volume":"10","author":"Taylor","year":"2009","journal-title":"J. Mach. Learn. Res."},
{"key":"B44","first-page":"10376","article-title":"Keeping your distance: solving sparse reward tasks using self-balancing shaped rewards","volume-title":"Advances in neural information processing systems 32: annual conference on neural information processing systems 2019","author":"Trott","year":"2019"},
{"key":"B46","doi-asserted-by":"publisher","first-page":"eads8932","DOI":"10.1126\/sciadv.ads8932","article-title":"Discovery of antimicrobial peptides with notable antibacterial potency by an LLM-based foundation model","volume":"11","author":"Wang","year":"","journal-title":"Sci. Adv."},
{"key":"B47","doi-asserted-by":"publisher","first-page":"637","DOI":"10.1039\/D4SC06864E","article-title":"3DSMILES-GPT: 3D molecular pocket-based generation with token-only large language model","volume":"16","author":"Wang","year":"","journal-title":"Chem. Sci."},
{"key":"B48","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1186\/s40168-020-00910-0","article-title":"Microbial dark matter filling the niche in hypersaline microbial mats","volume":"8","author":"Wong","year":"2020","journal-title":"Microbiome"},
{"key":"B49","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1007\/978-3-030-46417-2_2","article-title":"Genome assembly using reinforcement learning","volume-title":"Advances in bioinformatics and computational biology","author":"Xavier","year":"2020"},
{"key":"B50","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1017\/9781139061773.010","article-title":"Transfer learning in reinforcement learning","volume-title":"Transfer learning","author":"Yang","year":"2020"},
{"key":"B51","doi-asserted-by":"publisher","first-page":"898","DOI":"10.14569\/ijacsa.2023.0140798","article-title":"A review on machine-learning and nature-inspired algorithms for genome assembly","volume":"14","author":"Yassine","year":"2023","journal-title":"Int. J. Adv. Comput. Sci. Appl."},
{"key":"B52","doi-asserted-by":"publisher","first-page":"5739","DOI":"10.24963\/ijcai.2018\/820","article-title":"Towards sample efficient reinforcement learning","volume":"18","author":"Yu","year":"2018","journal-title":"Proc. 27th Int. Jt. Conf. Artif. Intell."},
{"key":"B53","doi-asserted-by":"publisher","first-page":"bbac384","DOI":"10.1093\/bib\/bbac384","article-title":"A geometric deep learning framework for drug repositioning over heterogeneous information networks","volume":"23","author":"Zhao","year":"2022","journal-title":"Briefings Bioinforma."},
{"key":"B54","doi-asserted-by":"publisher","first-page":"121360","DOI":"10.1016\/j.ins.2024.121360","article-title":"Regulation-aware graph learning for drug repositioning over heterogeneous biological network","volume":"686","author":"Zhao","year":"2025","journal-title":"Inf. Sci."},
{"key":"B55","doi-asserted-by":"publisher","first-page":"13344","DOI":"10.1109\/TPAMI.2023.3292075","article-title":"Transfer learning in deep reinforcement learning: a survey","volume":"45","author":"Zhu","year":"2023","journal-title":"IEEE Trans. Pattern Analysis Mach. Intell."}
],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1633623\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,20]],"date-time":"2025-08-20T05:34:48Z","timestamp":1755668088000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1633623\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,20]]},"references-count":50,"alternative-id":["10.3389\/fbinf.2025.1633623"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2025.1633623","relation":{},"ISSN":["2673-7647"],"issn-type":[{"type":"electronic","value":"2673-7647"}],"subject":[],"published":{"date-parts":[[2025,8,20]]},"article-number":"1633623"}}