Identification and validation of copy number variants using SNP genotyping arrays from a large clinical cohort

BACKGROUND: Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied trait...

Bibliographic Details
Main Authors: Valsesia, Armand; Stevenson, Brian J; Waterworth, Dawn; Mooser, Vincent; Vollenweider, Peter; Waeber, Gérard; Jongeneel, C Victor; Beckmann, Jacques S; Kutalik, Zoltán; Bergmann, Sven
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3464625/
https://www.ncbi.nlm.nih.gov/pubmed/22702538
http://dx.doi.org/10.1186/1471-2164-13-241
author Valsesia, Armand
Stevenson, Brian J
Waterworth, Dawn
Mooser, Vincent
Vollenweider, Peter
Waeber, Gérard
Jongeneel, C Victor
Beckmann, Jacques S
Kutalik, Zoltán
Bergmann, Sven
author_sort Valsesia, Armand
collection PubMed
description BACKGROUND: Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied traits. Genomic structural variants (GSV) such as Copy Number Variation (CNV) may account for part of the missing heritability, but their comprehensive detection requires either next-generation arrays or sequencing. Sophisticated algorithms that infer CNVs by combining the intensities from SNP-probes for the two alleles can already be used to extract a partial view of such GSV from existing data sets. RESULTS: Here we present several advances to facilitate the latter approach. First, we introduce a novel CNV detection method based on a Gaussian Mixture Model. Second, we propose a new algorithm, PCA merge, for combining copy-number profiles from many individuals into consensus regions. We applied both our new methods as well as existing ones to data from 5612 individuals from the CoLaus study who were genotyped on Affymetrix 500K arrays. We developed a number of procedures in order to evaluate the performance of the different methods. This includes comparison with previously published CNVs as well as using a replication sample of 239 individuals, genotyped with Illumina 550K arrays. We also established a new evaluation procedure that employs the fact that related individuals are expected to share their CNVs more frequently than randomly selected individuals. The ability to detect both rare and common CNVs provides a valuable resource that will facilitate association studies exploring potential phenotypic associations with CNVs. 
CONCLUSION: Our new methodologies for CNV detection and their evaluation will help in extracting additional information from the large amount of SNP-genotyping data on various cohorts and use this to explore structural variants and their impact on complex traits.
format Online
Article
Text
id pubmed-3464625
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3464625 2012-10-05 Identification and validation of copy number variants using SNP genotyping arrays from a large clinical cohort BMC Genomics Research Article BioMed Central 2012-06-15 /pmc/articles/PMC3464625/ /pubmed/22702538 http://dx.doi.org/10.1186/1471-2164-13-241 Text en Copyright ©2012 Valsesia et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identification and validation of copy number variants using SNP genotyping arrays from a large clinical cohort
topic Research Article