
GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs

Analysis of sequence diversity in the human genome is fundamental for genetic studies. Structural variants (SVs) are frequently omitted in sequence analysis studies, although each has a relatively large impact on the genome. Here, we present GraphTyper2, which uses pangenome graphs to genotype SVs a...


Bibliographic Details
Main Authors: Eggertsson, Hannes P., Kristmundsdottir, Snaedis, Beyter, Doruk, Jonsson, Hakon, Skuladottir, Astros, Hardarson, Marteinn T., Gudbjartsson, Daniel F., Stefansson, Kari, Halldorsson, Bjarni V., Melsted, Pall
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6881350/
https://www.ncbi.nlm.nih.gov/pubmed/31776332
http://dx.doi.org/10.1038/s41467-019-13341-9
_version_ 1783473928214151168
author Eggertsson, Hannes P.
Kristmundsdottir, Snaedis
Beyter, Doruk
Jonsson, Hakon
Skuladottir, Astros
Hardarson, Marteinn T.
Gudbjartsson, Daniel F.
Stefansson, Kari
Halldorsson, Bjarni V.
Melsted, Pall
author_sort Eggertsson, Hannes P.
collection PubMed
description Analysis of sequence diversity in the human genome is fundamental for genetic studies. Structural variants (SVs) are frequently omitted in sequence analysis studies, although each has a relatively large impact on the genome. Here, we present GraphTyper2, which uses pangenome graphs to genotype SVs and small variants using short-reads. Comparison to the syndip benchmark dataset shows that our SV genotyping is sensitive and variant segregation in families demonstrates the accuracy of our approach. We demonstrate that incorporating public assembly data into our pipeline greatly improves sensitivity, particularly for large insertions. We validate 6,812 SVs on average per genome using long-read data of 41 Icelanders. We show that GraphTyper2 can simultaneously genotype tens of thousands of whole-genomes by characterizing 60 million small variants and half a million SVs in 49,962 Icelanders, including 80 thousand SVs with high-confidence.
format Online
Article
Text
id pubmed-6881350
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6881350 2019-11-29 GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs Eggertsson, Hannes P. Kristmundsdottir, Snaedis Beyter, Doruk Jonsson, Hakon Skuladottir, Astros Hardarson, Marteinn T. Gudbjartsson, Daniel F. Stefansson, Kari Halldorsson, Bjarni V. Melsted, Pall Nat Commun Article Analysis of sequence diversity in the human genome is fundamental for genetic studies. Structural variants (SVs) are frequently omitted in sequence analysis studies, although each has a relatively large impact on the genome. Here, we present GraphTyper2, which uses pangenome graphs to genotype SVs and small variants using short-reads. Comparison to the syndip benchmark dataset shows that our SV genotyping is sensitive and variant segregation in families demonstrates the accuracy of our approach. We demonstrate that incorporating public assembly data into our pipeline greatly improves sensitivity, particularly for large insertions. We validate 6,812 SVs on average per genome using long-read data of 41 Icelanders. We show that GraphTyper2 can simultaneously genotype tens of thousands of whole-genomes by characterizing 60 million small variants and half a million SVs in 49,962 Icelanders, including 80 thousand SVs with high-confidence. Nature Publishing Group UK 2019-11-27 /pmc/articles/PMC6881350/ /pubmed/31776332 http://dx.doi.org/10.1038/s41467-019-13341-9 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material.
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6881350/
https://www.ncbi.nlm.nih.gov/pubmed/31776332
http://dx.doi.org/10.1038/s41467-019-13341-9