{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T03:13:26Z","timestamp":1769829206523,"version":"3.49.0"},"reference-count":83,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2019,12,1]],"date-time":"2019-12-01T00:00:00Z","timestamp":1575158400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"},{"start":{"date-parts":[[2019,12,1]],"date-time":"2019-12-01T00:00:00Z","timestamp":1575158400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"},{"start":{"date-parts":[[2019,12,1]],"date-time":"2019-12-01T00:00:00Z","timestamp":1575158400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Quant. Biol."],"published-print":{"date-parts":[[2019,12]]},"abstract":"<jats:sec><jats:title>Background<\/jats:title><jats:p><jats:italic>De novo<\/jats:italic> genome assembly relies on two kinds of graphs: <jats:italic>de Bruijn<\/jats:italic> graphs and overlap graphs. Overlap graphs are the basis for the Celera assembler, while <jats:italic>de Bruijn<\/jats:italic> graphs have become the dominant technical device in the last decade. Those two kinds of graphs are collectively called assembly graphs.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>In this review, we discuss the most recent advances in the problem of constructing, representing and navigating assembly graphs, focusing on very large datasets. We will also explore some computational techniques, such as the Bloom filter, to compactly store graphs while keeping all functionalities intact.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>We complete our analysis with a discussion on the algorithmic issues of assembling from long reads ( <jats:italic>e.g.,<\/jats:italic> PacBio and Oxford Nanopore). Finally, we present some of the most relevant open problems in this field.<\/jats:p><\/jats:sec>","DOI":"10.1007\/s40484-019-0181-x","type":"journal-article","created":{"date-parts":[[2019,12,16]],"date-time":"2019-12-16T08:16:38Z","timestamp":1576484198000},"page":"278-292","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":47,"title":["Overlap graphs and <i><b>de Bruijn<\/b><\/i> graphs: data structures for <i><b>de novo<\/b><\/i>genome assembly in the big data era"],"prefix":"10.1002","volume":"7","author":[{"given":"Raffaella","family":"Rizzi","sequence":"first","affiliation":[{"name":"Department of Informatics, Systems and Communications University of Milan\u2010Bicocca Milan Viale Sarca 336 20126 Italy"}]},{"given":"Stefano","family":"Beretta","sequence":"additional","affiliation":[{"name":"Department of Informatics, Systems and Communications University of Milan\u2010Bicocca Milan Viale Sarca 336 20126 Italy"}]},{"given":"Murray","family":"Patterson","sequence":"additional","affiliation":[{"name":"Department of Informatics, Systems and Communications University of Milan\u2010Bicocca Milan Viale Sarca 336 20126 Italy"}]},{"given":"Yuri","family":"Pirola","sequence":"additional","affiliation":[{"name":"Department of Informatics, Systems and Communications University of Milan\u2010Bicocca Milan Viale Sarca 336 20126 Italy"}]},{"given":"Marco","family":"Previtali","sequence":"additional","affiliation":[{"name":"Department of Informatics, Systems and Communications University of Milan\u2010Bicocca Milan Viale Sarca 336 20126 Italy"}]},{"given":"Gianluca","family":"Della Vedova","sequence":"additional","affiliation":[{"name":"Department of Informatics, Systems and Communications University of Milan\u2010Bicocca Milan Viale Sarca 336 20126 Italy"}]},{"given":"Paola","family":"Bonizzoni","sequence":"additional","affiliation":[{"name":"Department of Informatics, Systems and Communications University of Milan\u2010Bicocca Milan Viale Sarca 336 20126 Italy"}]}],"member":"311","published-online":{"date-parts":[[2019,12]]},"reference":[{"key":"e_1_2_8_2_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-31957-5_11"},{"key":"e_1_2_8_3_2","volume-title":"A block\u2010sorting lossless data compression algorithm","author":"Burrows M.","year":"1994"},{"key":"e_1_2_8_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/1082036.1082039"},{"key":"e_1_2_8_5_2","doi-asserted-by":"publisher","DOI":"10.1038\/nmeth.1935"},{"key":"e_1_2_8_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ygeno.2010.03.001"},{"key":"e_1_2_8_7_2","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbq020"},{"key":"e_1_2_8_8_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0019175"},{"key":"e_1_2_8_9_2","doi-asserted-by":"publisher","DOI":"10.1093\/bfgp\/elr035"},{"key":"e_1_2_8_10_2","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbx147"},{"key":"e_1_2_8_11_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1006994"},{"key":"e_1_2_8_12_2","doi-asserted-by":"publisher","DOI":"10.1007\/BFb0027775"},{"key":"e_1_2_8_13_2","doi-asserted-by":"publisher","DOI":"10.1137\/0222058"},{"key":"e_1_2_8_14_2","first-page":"31","volume-title":"Bioinformatics and Biomedicine (BIBM \u201909)","author":"Lam T. W.","year":"2009"},{"key":"e_1_2_8_15_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btq217"},{"key":"e_1_2_8_16_2","doi-asserted-by":"publisher","DOI":"10.1101\/gr.126953.111"},{"key":"e_1_2_8_17_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btp324"},{"key":"e_1_2_8_18_2","doi-asserted-by":"publisher","DOI":"10.1186\/gb\u20102009\u201010\u20103\u2010r25"},{"key":"e_1_2_8_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jda.2016.03.003"},{"key":"e_1_2_8_20_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu541"},{"key":"e_1_2_8_21_2","first-page":"26","article-title":"Multithread multistring Burrows\u2010Wheeler transform and longest common prefix array","author":"Bonizzoni P.","year":"2019","journal-title":"J. Comput. Biol."},{"key":"e_1_2_8_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/362686.362692"},{"key":"e_1_2_8_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/90.851975"},{"key":"e_1_2_8_24_2","unstructured":"Broder A. Z.(1997)On the resemblance and containment of documents. InProceedings of Compression and Complexity of SEQUENCES 1997 pp.21\u201329. IEEE"},{"key":"e_1_2_8_25_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bth408"},{"key":"e_1_2_8_26_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu538"},{"key":"e_1_2_8_27_2","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkt1069"},{"key":"e_1_2_8_28_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bti1114"},{"key":"e_1_2_8_29_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471\u20102105\u201013\u201082"},{"key":"e_1_2_8_30_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.171285098"},{"key":"e_1_2_8_31_2","doi-asserted-by":"publisher","DOI":"10.1089\/cmb.2015.0172"},{"key":"e_1_2_8_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-12683-3_28"},{"key":"e_1_2_8_33_2","doi-asserted-by":"crossref","unstructured":"Medvedev P. Pham S. Chaisson M. Tesler G.andPevzner P.(2011)Pairedde Bruijngraphs: A novel approach for incorporating mate pair information into genome assemblers. InProceedings of the 15th Annual International Conference on Research in Computational Molecular Biology RECOMB\u201911 pp.238\u2013251 Springer\u2010Verlag","DOI":"10.1007\/978-3-642-20036-6_22"},{"key":"e_1_2_8_34_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-29627-7_21"},{"key":"e_1_2_8_35_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bts219"},{"key":"e_1_2_8_36_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu291"},{"key":"e_1_2_8_37_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bts280"},{"key":"e_1_2_8_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2010.188"},{"key":"e_1_2_8_39_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2012.02.002"},{"key":"e_1_2_8_40_2","doi-asserted-by":"publisher","DOI":"10.1089\/cmb.2017.0089"},{"key":"e_1_2_8_41_2","volume-title":"Computer and Intractability: A Guide to the Theory of NP\u2010completeness","author":"Garey M. R.","year":"1979"},{"key":"e_1_2_8_42_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btq697"},{"key":"e_1_2_8_43_2","doi-asserted-by":"publisher","DOI":"10.1186\/1748\u20107188\u20108\u201022"},{"key":"e_1_2_8_44_2","doi-asserted-by":"publisher","DOI":"10.1186\/1748\u20107188\u20109\u20102"},{"key":"e_1_2_8_45_2","doi-asserted-by":"publisher","DOI":"10.1101\/gr.214346.116"},{"key":"e_1_2_8_46_2","doi-asserted-by":"publisher","DOI":"10.1089\/cmb.2014.0160"},{"key":"e_1_2_8_47_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33122-0_18"},{"key":"e_1_2_8_48_2","doi-asserted-by":"crossref","unstructured":"Boucher C. Bowe A. Gagie T. Puglisi S. J.andSadakane K.(2015)Variable\u2010orderde Bruijngraphs. InData Compression Conference (DCC) pp.383\u2013392","DOI":"10.1109\/DCC.2015.70"},{"key":"e_1_2_8_49_2","doi-asserted-by":"publisher","DOI":"10.1142\/S0129054118430037"},{"key":"e_1_2_8_50_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btx067"},{"key":"e_1_2_8_51_2","doi-asserted-by":"crossref","unstructured":"Almodaresi F. Pandey P.andPatro R.(2017)Rainbowfish: a succinct coloredde Bruijngraph representation. In17th International Workshop on Algorithms in Bioinformatics (WABI 2017).","DOI":"10.1101\/138016"},{"key":"e_1_2_8_52_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bty292"},{"key":"e_1_2_8_53_2","doi-asserted-by":"crossref","unstructured":"Muggli M. D. Alipanahi B. Boucher C.(2019)Building large updatable coloredde Bruijngraphs via merging. bioRxiv 229641","DOI":"10.1093\/bioinformatics\/btz350"},{"key":"e_1_2_8_54_2","volume-title":"An introduction to bioinformatics algorithms","author":"Neil C.","year":"2004"},{"key":"e_1_2_8_55_2","doi-asserted-by":"publisher","DOI":"10.1101\/gr.215087.116"},{"key":"e_1_2_8_56_2","doi-asserted-by":"publisher","DOI":"10.1038\/nbt.3238"},{"key":"e_1_2_8_57_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btw811"},{"key":"e_1_2_8_58_2","doi-asserted-by":"crossref","unstructured":"Ruan J.andLi H.(2019)Fast and accurate long\u2010read assembly with wtdbg2. bioRxiv 530972","DOI":"10.1101\/530972"},{"key":"e_1_2_8_59_2","doi-asserted-by":"publisher","DOI":"10.1101\/gr.216465.116"},{"key":"e_1_2_8_60_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-44753-6_5"},{"key":"e_1_2_8_61_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btn548"},{"key":"e_1_2_8_62_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1604560113"},{"key":"e_1_2_8_63_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu266"},{"key":"e_1_2_8_64_2","doi-asserted-by":"publisher","DOI":"10.1038\/nmeth.4035"},{"key":"e_1_2_8_65_2","doi-asserted-by":"publisher","DOI":"10.1038\/nmeth.2474"},{"key":"e_1_2_8_66_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/18.suppl_1.S294"},{"key":"e_1_2_8_67_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.287.5461.2196"},{"key":"e_1_2_8_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/3130348.3130371"},{"key":"e_1_2_8_69_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btw152"},{"key":"e_1_2_8_70_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ic.2012.02.002"},{"key":"e_1_2_8_71_2","doi-asserted-by":"crossref","unstructured":"Egidi L. Louza F. A. Manzini G.andTelles G.(2018)External memory BWT and LCP computation for sequence collections with applications. In18th International Workshop on Algorithms in Bioinformatics WABI 2018 Helsinki Finland volume 113 pp.1\u201314","DOI":"10.1186\/s13015-019-0140-0"},{"key":"e_1_2_8_72_2","doi-asserted-by":"publisher","DOI":"10.1186\/s12864\u2010016\u20102829\u20105"},{"key":"e_1_2_8_73_2","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbw096"},{"key":"e_1_2_8_74_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu023"},{"key":"e_1_2_8_75_2","doi-asserted-by":"publisher","DOI":"10.1038\/nbt.4227"},{"key":"e_1_2_8_76_2","doi-asserted-by":"publisher","DOI":"10.1038\/ng.1028"},{"key":"e_1_2_8_77_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cels.2018.05.021"},{"key":"e_1_2_8_78_2","doi-asserted-by":"publisher","DOI":"10.1101\/gr.089532.108"},{"key":"e_1_2_8_79_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu584"},{"key":"e_1_2_8_80_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btw609"},{"key":"e_1_2_8_81_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu756"},{"key":"e_1_2_8_82_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-07566-2_10"},{"key":"e_1_2_8_83_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btv603"},{"key":"e_1_2_8_84_2","first-page":"118","article-title":"Computational pan\u2010genomics: status, promises and challenges.","volume":"19","author":"Marschall T.","year":"2016","journal-title":"Brief. Bioinform."}],"container-title":["Quantitative Biology"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s40484-019-0181-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s40484-019-0181-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s40484-019-0181-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1007\/s40484-019-0181-x","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,3]],"date-time":"2025-01-03T09:52:28Z","timestamp":1735897948000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1007\/s40484-019-0181-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12]]},"references-count":83,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,12]]}},"alternative-id":["10.1007\/s40484-019-0181-x"],"URL":"https:\/\/doi.org\/10.1007\/s40484-019-0181-x","archive":["Portico"],"relation":{},"ISSN":["2095-4689","2095-4697"],"issn-type":[{"value":"2095-4689","type":"print"},{"value":"2095-4697","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,12]]},"assertion":[{"value":"4 June 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 July 2019","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 July 2019","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 December 2019","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors Raffaella Rizzi, Stefano Beretta, Murray Patterson, Yuri Pirola, Marco Previtali, Gianluca Della Vedova and Paola Bonizzoni declare that they have no conflict of interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with Ethics Guidelines"}},{"value":"This article is a review article and does not contain any studies with human or animal subjects performed by any of the authors.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with Ethics Guidelines"}},{"value":"This content has been made available to all.","name":"free","label":"Free to read"}]}}