GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing
The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we contribute to fil...
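Main authors: | Valls-Margarit, Jordi; Galván-Femenía, Iván; Matías-Sánchez, Daniel; Blay, Natalia; Puiggròs, Montserrat; Carreras, Anna; Salvoro, Cecilia; Cortés, Beatriz; Amela, Ramon; Farre, Xavier; Lerga-Jaso, Jon; Puig, Marta; Sánchez-Herrero, Jose Francisco; Moreno, Victor; Perucho, Manuel; Sumoy, Lauro; Armengol, Lluís; Delaneau, Olivier; Cáceres, Mario; de Cid, Rafael; Torrents, David |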
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8934637/ https://www.ncbi.nlm.nih.gov/pubmed/35176773 http://dx.doi.org/10.1093/nar/gkac076 |
_version_ | 1784671888253911040 |
---|---|
author | Valls-Margarit, Jordi Galván-Femenía, Iván Matías-Sánchez, Daniel Blay, Natalia Puiggròs, Montserrat Carreras, Anna Salvoro, Cecilia Cortés, Beatriz Amela, Ramon Farre, Xavier Lerga-Jaso, Jon Puig, Marta Sánchez-Herrero, Jose Francisco Moreno, Victor Perucho, Manuel Sumoy, Lauro Armengol, Lluís Delaneau, Olivier Cáceres, Mario de Cid, Rafael Torrents, David |
author_facet | Valls-Margarit, Jordi Galván-Femenía, Iván Matías-Sánchez, Daniel Blay, Natalia Puiggròs, Montserrat Carreras, Anna Salvoro, Cecilia Cortés, Beatriz Amela, Ramon Farre, Xavier Lerga-Jaso, Jon Puig, Marta Sánchez-Herrero, Jose Francisco Moreno, Victor Perucho, Manuel Sumoy, Lauro Armengol, Lluís Delaneau, Olivier Cáceres, Mario de Cid, Rafael Torrents, David |
author_sort | Valls-Margarit, Jordi |
collection | PubMed |
description | The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we contribute to fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high coverage (30x) whole-genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel is able to impute up to 14 360 728 SNVs/indels and 23 179 SVs, showing a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SVs analysis is shown through an imputed rare Alu element located in a new locus associated with Mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include the SVs into genome-wide genetic studies. |
format | Online Article Text |
id | pubmed-8934637 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-89346372022-03-21 GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing Valls-Margarit, Jordi Galván-Femenía, Iván Matías-Sánchez, Daniel Blay, Natalia Puiggròs, Montserrat Carreras, Anna Salvoro, Cecilia Cortés, Beatriz Amela, Ramon Farre, Xavier Lerga-Jaso, Jon Puig, Marta Sánchez-Herrero, Jose Francisco Moreno, Victor Perucho, Manuel Sumoy, Lauro Armengol, Lluís Delaneau, Olivier Cáceres, Mario de Cid, Rafael Torrents, David Nucleic Acids Res Computational Biology The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we contribute to fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high coverage (30x) whole-genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel is able to impute up to 14 360 728 SNVs/indels and 23 179 SVs, showing a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SVs analysis is shown through an imputed rare Alu element located in a new locus associated with Mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include the SVs into genome-wide genetic studies. 
Oxford University Press 2022-02-18 /pmc/articles/PMC8934637/ /pubmed/35176773 http://dx.doi.org/10.1093/nar/gkac076 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by-nc/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Computational Biology Valls-Margarit, Jordi Galván-Femenía, Iván Matías-Sánchez, Daniel Blay, Natalia Puiggròs, Montserrat Carreras, Anna Salvoro, Cecilia Cortés, Beatriz Amela, Ramon Farre, Xavier Lerga-Jaso, Jon Puig, Marta Sánchez-Herrero, Jose Francisco Moreno, Victor Perucho, Manuel Sumoy, Lauro Armengol, Lluís Delaneau, Olivier Cáceres, Mario de Cid, Rafael Torrents, David GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing |
title | GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing |
title_full | GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing |
title_fullStr | GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing |
title_full_unstemmed | GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing |
title_short | GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing |
title_sort | gcat|panel, a comprehensive structural variant haplotype map of the iberian population from high-coverage whole-genome sequencing |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8934637/ https://www.ncbi.nlm.nih.gov/pubmed/35176773 http://dx.doi.org/10.1093/nar/gkac076 |
work_keys_str_mv | AT vallsmargaritjordi gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT galvanfemeniaivan gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT matiassanchezdaniel gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT blaynatalia gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT puiggrosmontserrat gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT carrerasanna gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT salvorocecilia gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT cortesbeatriz gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT amelaramon gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT farrexavier gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT lergajasojon gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT puigmarta gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT sanchezherrerojosefrancisco gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT morenovictor gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT peruchomanuel gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT sumoylauro 
gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT armengollluis gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT delaneauolivier gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT caceresmario gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT decidrafael gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing AT torrentsdavid gcatpanelacomprehensivestructuralvarianthaplotypemapoftheiberianpopulationfromhighcoveragewholegenomesequencing |