Pangenome graph layout by Path-Guided Stochastic Gradient Descent

MOTIVATION: The increasing availability of complete genomes calls for models to study genomic variability within entire populations. Pangenome graphs capture the full genomic similarity and diversity between multiple genomes. In order to understand them, we need to see them. For visualization, we...

Bibliographic Details
Main Authors: Heumos, Simon; Guarracino, Andrea; Schmelzle, Jan-Niklas M.; Li, Jiajie; Zhang, Zhiru; Hagmann, Jörg; Nahnsen, Sven; Prins, Pjotr; Garrison, Erik
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10542513/
https://www.ncbi.nlm.nih.gov/pubmed/37790531
http://dx.doi.org/10.1101/2023.09.22.558964
author Heumos, Simon
Guarracino, Andrea
Schmelzle, Jan-Niklas M.
Li, Jiajie
Zhang, Zhiru
Hagmann, Jörg
Nahnsen, Sven
Prins, Pjotr
Garrison, Erik
author_sort Heumos, Simon
collection PubMed
description MOTIVATION: The increasing availability of complete genomes calls for models to study genomic variability within entire populations. Pangenome graphs capture the full genomic similarity and diversity between multiple genomes. In order to understand them, we need to see them. For visualization, we need a human-readable graph layout: a graph embedding in a low-dimensional (e.g., two-dimensional) depiction. Due to a pangenome graph’s potentially excessive size, this is a significant challenge. RESULTS: In response, we introduce a novel graph layout algorithm: the Path-Guided Stochastic Gradient Descent (PG-SGD). PG-SGD uses the genomes, represented in the pangenome graph as paths, as an embedded positional system to sample genomic distances between pairs of nodes. This avoids the quadratic cost seen in previous versions of graph drawing by Stochastic Gradient Descent (SGD). We show that our implementation efficiently computes the low-dimensional layouts of gigabase-scale pangenome graphs, unveiling their biological features. AVAILABILITY: We integrated PG-SGD in ODGI, which is released as free software under the MIT open source license. Source code is available at https://github.com/pangenome/odgi.
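The sampling idea in the description above can be illustrated with a toy sketch. This is not the ODGI implementation: it assumes one 2D point per node, uniform sampling of step pairs within a path, random initial positions, and an exponentially decaying learning rate borrowed from general SGD graph drawing. The names `path_offsets`, `pg_sgd_layout`, and `path_stress` are hypothetical and exist only for this illustration.

```python
import math
import random


def path_offsets(path, node_len):
    """Return the nucleotide offset at which each step of a path starts."""
    offsets, pos = [], 0
    for node in path:
        offsets.append(pos)
        pos += node_len[node]
    return offsets


def pg_sgd_layout(paths, node_len, iterations=30, eps=0.01, seed=0):
    """Toy 2D path-guided SGD layout; returns {node: [x, y]}.

    Assumes positive integer node lengths (hypothetical input format).
    """
    rng = random.Random(seed)
    span = sum(node_len.values())
    # Random initial placement (an assumption of this sketch).
    xy = {n: [rng.uniform(0, span), rng.uniform(0, span)]
          for n in sorted(node_len)}
    paths = [p for p in paths if len(p) >= 2]
    if not paths or iterations < 1:
        return xy
    offsets = [path_offsets(p, node_len) for p in paths]
    steps = [len(p) for p in paths]
    # Learning-rate schedule (assumed): decay from d_max^2 down to eps.
    d_max = max(off[-1] for off in offsets)
    eta_max, eta_min = float(d_max) ** 2, eps
    lam = math.log(eta_max / eta_min) / max(iterations - 1, 1)
    for t in range(iterations):
        eta = eta_max * math.exp(-lam * t)
        # Work per iteration scales with total path steps, not node pairs.
        for _ in range(sum(steps)):
            k = rng.choices(range(len(paths)), weights=steps)[0]
            i, j = rng.sample(range(steps[k]), 2)
            a, b = paths[k][i], paths[k][j]
            if a == b:  # the path visits the same node twice
                continue
            # Target distance: nucleotide distance along the sampled path.
            d = abs(offsets[k][i] - offsets[k][j])
            dx = xy[a][0] - xy[b][0]
            dy = xy[a][1] - xy[b][1]
            mag = math.hypot(dx, dy)
            if mag == 0.0:
                continue
            mu = min(eta / d ** 2, 1.0)  # weight 1/d^2, step capped at 1
            r = mu * (mag - d) / (2.0 * mag)
            xy[a][0] -= r * dx
            xy[a][1] -= r * dy
            xy[b][0] += r * dx
            xy[b][1] += r * dy
    return xy


def path_stress(xy, paths, node_len):
    """Sum of squared relative distance errors over all same-path pairs.

    Quadratic in path length; intended only for checking toy inputs.
    """
    total = 0.0
    for p in paths:
        off = path_offsets(p, node_len)
        for i in range(len(p)):
            for j in range(i + 1, len(p)):
                if p[i] == p[j]:
                    continue
                d = off[j] - off[i]
                total += ((math.dist(xy[p[i]], xy[p[j]]) - d) / d) ** 2
    return total


if __name__ == "__main__":
    # Two genomes sharing nodes 0 and 3, with a small bubble (nodes 1 vs 2).
    node_len = {0: 20, 1: 1, 2: 1, 3: 20}
    paths = [[0, 1, 3], [0, 2, 3]]
    layout = pg_sgd_layout(paths, node_len)
    print(layout)
    print(path_stress(layout, paths, node_len))
```

Each update touches only one pair of steps drawn from a single path, so the target distance comes directly from path positions and no all-pairs distance computation over the graph is needed. That is the property the description refers to when it says the quadratic cost is avoided.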
format Online
Article
Text
id pubmed-10542513
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10542513 2023-10-03 Pangenome graph layout by Path-Guided Stochastic Gradient Descent Heumos, Simon Guarracino, Andrea Schmelzle, Jan-Niklas M. Li, Jiajie Zhang, Zhiru Hagmann, Jörg Nahnsen, Sven Prins, Pjotr Garrison, Erik bioRxiv Article Cold Spring Harbor Laboratory 2023-10-17 /pmc/articles/PMC10542513/ /pubmed/37790531 http://dx.doi.org/10.1101/2023.09.22.558964 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Pangenome graph layout by Path-Guided Stochastic Gradient Descent
topic Article