{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,21]],"date-time":"2025-11-21T11:31:50Z","timestamp":1763724710227,"version":"3.41.2"},"reference-count":25,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2024,7,4]],"date-time":"2024-07-04T00:00:00Z","timestamp":1720051200000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics Infrastructure","doi-asserted-by":"publisher","award":["031A533A","031A533B","031A534A","031A535A","031A537A","031A537B","031A537C","031A537D"],"award-info":[{"award-number":["031A533A","031A533B","031A534A","031A535A","031A537A","031A537B","031A537C","031A537D"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The increasing availability of complete genomes demands for models to study genomic variability within entire populations. Pangenome graphs capture the full genomic similarity and diversity between multiple genomes. In order to understand them, we need to see them. For visualization, we need a human-readable graph layout: a graph embedding in low (e.g. two) dimensional depictions. Due to a pangenome graph\u2019s potential excessive size, this is a significant challenge.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>In response, we introduce a novel graph layout algorithm: the Path-Guided Stochastic Gradient Descent (PG-SGD). PG-SGD uses the genomes, represented in the pangenome graph as paths, as an embedded positional system to sample genomic distances between pairs of nodes. This avoids the quadratic cost seen in previous versions of graph drawing by SGD. We show that our implementation efficiently computes the low-dimensional layouts of gigabase-scale pangenome graphs, unveiling their biological features.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>We integrated PG-SGD in ODGI which is released as free software under the MIT open source license. Source code is available at https:\/\/github.com\/pangenome\/odgi.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae363","type":"journal-article","created":{"date-parts":[[2024,7,2]],"date-time":"2024-07-02T22:11:35Z","timestamp":1719958295000},"source":"Crossref","is-referenced-by-count":7,"title":["Pangenome graph layout by Path-Guided Stochastic Gradient Descent"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3326-817X","authenticated-orcid":false,"given":"Simon","family":"Heumos","sequence":"first","affiliation":[{"name":"Quantitative Biology Center (QBiC), University of T\u00fcbingen , 72076 T\u00fcbingen, Germany"},{"name":"Biomedical Data Science, Department of Computer Science, University of T\u00fcbingen , 72076 T\u00fcbingen, Germany"},{"name":"M3 Research Center, University Hospital T\u00fcbingen , 72076 T\u00fcbingen, Germany"},{"name":"Institute for Bioinformatics and Medical Informatics (IBMI), University of T\u00fcbingen , 72076 T\u00fcbingen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9744-131X","authenticated-orcid":false,"given":"Andrea","family":"Guarracino","sequence":"additional","affiliation":[{"name":"Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center , Memphis, TN 38163, United States"},{"name":"Genomics Research Centre, Human Technopole , 20157 Milan, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8566-4049","authenticated-orcid":false,"given":"Jan-Niklas M","family":"Schmelzle","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, School of Computation, Information and Technology (CIT), Technical University of Munich , 80333 Munich, Germany"},{"name":"School of Electrical and Computer Engineering, Cornell University , Ithaca, NY 14853, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6775-2843","authenticated-orcid":false,"given":"Jiajie","family":"Li","sequence":"additional","affiliation":[{"name":"School of Electrical and Computer Engineering, Cornell University , Ithaca, NY 14853, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0778-0308","authenticated-orcid":false,"given":"Zhiru","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Electrical and Computer Engineering, Cornell University , Ithaca, NY 14853, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7232-3103","authenticated-orcid":false,"given":"J\u00f6rg","family":"Hagmann","sequence":"additional","affiliation":[{"name":"Computomics GmbH , 72072 T\u00fcbingen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4375-0691","authenticated-orcid":false,"given":"Sven","family":"Nahnsen","sequence":"additional","affiliation":[{"name":"Quantitative Biology Center (QBiC), University of T\u00fcbingen , 72076 T\u00fcbingen, Germany"},{"name":"Biomedical Data Science, Department of Computer Science, University of T\u00fcbingen , 72076 T\u00fcbingen, Germany"},{"name":"M3 Research Center, University Hospital T\u00fcbingen , 72076 T\u00fcbingen, Germany"},{"name":"Institute for Bioinformatics and Medical Informatics (IBMI), University of T\u00fcbingen , 72076 T\u00fcbingen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8021-9162","authenticated-orcid":false,"given":"Pjotr","family":"Prins","sequence":"additional","affiliation":[{"name":"Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center , Memphis, TN 38163, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3821-631X","authenticated-orcid":false,"given":"Erik","family":"Garrison","sequence":"additional","affiliation":[{"name":"Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center , Memphis, TN 38163, United States"}]}],"member":"286","published-online":{"date-parts":[[2024,7,3]]},"reference":[{"key":"2024070622011583100_btae363-B1","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1186\/s13059-019-1774-4","article-title":"Is it time to change the reference genome?","volume":"20","author":"Ballouz","year":"2019","journal-title":"Genome Biol"},{"key":"2024070622011583100_btae363-B2","first-page":"65","article-title":"Force-directed algorithms for schematic drawings and placement: a survey","volume-title":"Inf Vis","author":"Cheong","year":"2019"},{"key":"2024070622011583100_btae363-B3","first-page":"118","article-title":"Computational pan-genomics: status, promises and challenges","volume":"19","author":"Computational Pan-Genomics Consortium","year":"2018","journal-title":"Brief Bioinform"},{"key":"2024070622011583100_btae363-B4","article-title":"PanPA: generation and alignment of panproteome graphs","volume-title":"Bioinformatics","author":"Dabbaghie","year":"2023"},{"key":"2024070622011583100_btae363-B5","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1146\/annurev-genom-120219-080406","article-title":"Pangenome graphs","volume":"21","author":"Eizenga","year":"2020","journal-title":"Annu Rev Genomics Hum Genet"},{"volume-title":"Graphical pangenomics","year":"2019","author":"Garrison","key":"2024070622011583100_btae363-B6"},{"key":"2024070622011583100_btae363-B7","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1038\/nbt.4227","article-title":"Variation graph toolkit improves read mapping by representing genetic variation in the reference","volume":"36","author":"Garrison","year":"2018","journal-title":"Nat Biotechnol"},{"year":"2023","author":"Garrison","key":"2024070622011583100_btae363-B8"},{"first-page":"326","year":"2014","author":"Gog","key":"2024070622011583100_btae363-B9"},{"key":"2024070622011583100_btae363-B10","doi-asserted-by":"crossref","first-page":"3319","DOI":"10.1093\/bioinformatics\/btac308","article-title":"ODGI: understanding pangenome graphs","volume":"38","author":"Guarracino","year":"2022","journal-title":"Bioinformatics"},{"key":"2024070622011583100_btae363-B11","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1038\/s41586-023-05976-y","article-title":"Recombination between heterologous human acrocentric chromosomes","volume":"617","author":"Guarracino","year":"2023","journal-title":"Nature"},{"year":"2005","author":"Hachul","key":"2024070622011583100_btae363-B12"},{"key":"2024070622011583100_btae363-B13","first-page":"649","article-title":"A new method that simultaneously aligns and reconstructs ancestral sequences for any number of homologous sequences, when the phylogeny is given","volume":"6","author":"Hein","year":"1989","journal-title":"Mol Biol Evol"},{"key":"2024070622011583100_btae363-B14","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1038\/s41586-023-05896-x","article-title":"A draft human pangenome reference","volume":"617","author":"Liao","year":"2023","journal-title":"Nature"},{"key":"2024070622011583100_btae363-B15","doi-asserted-by":"crossref","first-page":"988","DOI":"10.1038\/nature03187","article-title":"The sequence and analysis of duplication-rich human chromosome 16","volume":"432","author":"Martin","year":"2004","journal-title":"Nature"},{"key":"2024070622011583100_btae363-B16","first-page":"44","article-title":"The complete sequence of a human genome","volume-title":"Science","author":"Nurk","year":"2022"},{"volume-title":"Advances in Neural Information Processing Systems","year":"2011","author":"Recht","key":"2024070622011583100_btae363-B17"},{"key":"2024070622011583100_btae363-B18","doi-asserted-by":"crossref","first-page":"849","DOI":"10.1101\/gr.213611.116","article-title":"Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly","volume":"27","author":"Schneider","year":"2017","journal-title":"Genome Res"},{"key":"2024070622011583100_btae363-B19","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1038\/s41576-020-0210-7","article-title":"Pan-genomics in the human genome era","volume":"21","author":"Sherman","year":"2020","journal-title":"Nat Rev Genet"},{"key":"2024070622011583100_btae363-B20","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1038\/s41592-022-01731-9","article-title":"Haplotype-aware pantranscriptome analyses using spliced pangenome graphs","volume":"20","author":"Sibbesen","year":"2023","journal-title":"Nat Methods"},{"key":"2024070622011583100_btae363-B21","doi-asserted-by":"crossref","first-page":"1042550","DOI":"10.3389\/fgene.2022.1042550","article-title":"From the reference human genome to human pangenome: premise, promise and challenge","volume":"13","author":"Singh","year":"2022","journal-title":"Front Genet"},{"key":"2024070622011583100_btae363-B22","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1016\/j.mib.2008.09.006","article-title":"Comparative genomics: the bacterial pan-genome","volume":"11","author":"Tettelin","year":"2008","journal-title":"Curr Opin Microbiol"},{"year":"2014","author":"Wang","key":"2024070622011583100_btae363-B23"},{"key":"2024070622011583100_btae363-B24","doi-asserted-by":"crossref","first-page":"2738","DOI":"10.1109\/TVCG.2018.2859997","article-title":"Graph drawing by stochastic gradient descent","volume":"25","author":"Zheng","year":"2019","journal-title":"IEEE Trans Vis Comput Graph"},{"key":"2024070622011583100_btae363-B25","doi-asserted-by":"crossref","DOI":"10.4159\/harvard.9780674434929","volume-title":"Selected Studies of the Principle of Relative Frequency in Language","author":"Zipf","year":"1932"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae363\/58425532\/btae363.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/7\/btae363\/58463786\/btae363.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/7\/btae363\/58463786\/btae363.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,6]],"date-time":"2024-07-06T22:04:51Z","timestamp":1720303491000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae363\/7705520"}},"subtitle":[],"editor":[{"given":"Peter","family":"Robinson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,7,1]]},"references-count":25,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2024,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae363","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2024,7]]},"published":{"date-parts":[[2024,7,1]]},"article-number":"btae363"}}