Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data

Bibliographic Details
Main Authors: Kim, Soon-Young; Kim, Ji-Hong; Chung, Yeun-Jun
Format: Online Article Text
Language: English
Published: Korea Genome Organization 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3492655/
https://www.ncbi.nlm.nih.gov/pubmed/23166530
http://dx.doi.org/10.5808/GI.2012.10.3.194
_version_ 1782249149964484608
author Kim, Soon-Young
Kim, Ji-Hong
Chung, Yeun-Jun
author_facet Kim, Soon-Young
Kim, Ji-Hong
Chung, Yeun-Jun
author_sort Kim, Soon-Young
collection PubMed
description In addition to single-nucleotide polymorphisms (SNP), copy number variation (CNV) is a major component of human genetic diversity. Among many whole-genome analysis platforms, SNP arrays have been commonly used for genomewide CNV discovery. Recently, a number of CNV defining algorithms from SNP genotyping data have been developed; however, due to the fundamental limitation of SNP genotyping data for the measurement of signal intensity, there are still concerns regarding the possibility of false discovery or low sensitivity for detecting CNVs. In this study, we aimed to verify the effect of combining multiple CNV calling algorithms and set up the most reliable pipeline for CNV calling with Affymetrix Genomewide SNP 5.0 data. For this purpose, we selected the 3 most commonly used algorithms for CNV segmentation from SNP genotyping data: PennCNV, QuantiSNP, and BirdSuite. After defining the CNV loci using the 3 different algorithms, we assessed how many of them overlapped with each other, and we also validated the CNVs by genomic quantitative PCR. Through this analysis, we proposed that for reliable CNV-based genomewide association study using SNP array data, CNV calls must be performed with at least 3 different algorithms and that the CNVs consistently called from more than 2 algorithms must be used for association analysis, because they are more reliable than the CNVs called from a single algorithm. Our result will be helpful to set up the CNV analysis protocols for Affymetrix Genomewide SNP 5.0 genotyping data.
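The consensus step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `CnvCall` record, the `overlaps` rule (same sample, chromosome, and CNV type with any shared base pair), and the `min_callers` threshold are all assumptions made for illustration, and the abstract does not specify the overlap criterion the study used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CnvCall:
    """One CNV call from one algorithm (hypothetical record layout)."""
    sample: str
    chrom: str
    start: int
    end: int
    kind: str    # "gain" or "loss"
    caller: str  # e.g. "PennCNV", "QuantiSNP", "BirdSuite"


def overlaps(a: CnvCall, b: CnvCall) -> bool:
    # Assumed rule: same sample, chromosome, and CNV type, and the two
    # intervals share at least one base pair.
    return (
        a.sample == b.sample
        and a.chrom == b.chrom
        and a.kind == b.kind
        and a.start <= b.end
        and b.start <= a.end
    )


def consensus_calls(calls, min_callers=2):
    """Keep calls supported by at least `min_callers` distinct algorithms."""
    kept = []
    for call in calls:
        # A call always overlaps itself, so its own caller is counted.
        supporters = {other.caller for other in calls if overlaps(call, other)}
        if len(supporters) >= min_callers:
            kept.append(call)
    return kept


calls = [
    CnvCall("S1", "chr1", 1000, 5000, "loss", "PennCNV"),
    CnvCall("S1", "chr1", 1200, 4800, "loss", "QuantiSNP"),
    CnvCall("S1", "chr2", 100, 900, "gain", "BirdSuite"),
]
reliable = consensus_calls(calls)
```

Here the two overlapping chr1 losses are kept and the chr2 gain, seen by a single algorithm, is dropped. The pairwise comparison is quadratic in the number of calls, which is acceptable for a sketch but would likely need an interval index for genomewide data.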
format Online
Article
Text
id pubmed-3492655
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Korea Genome Organization
record_format MEDLINE/PubMed
spelling pubmed-3492655 2012-11-19 Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data Kim, Soon-Young Kim, Ji-Hong Chung, Yeun-Jun Genomics Inform Original Article In addition to single-nucleotide polymorphisms (SNP), copy number variation (CNV) is a major component of human genetic diversity. Among many whole-genome analysis platforms, SNP arrays have been commonly used for genomewide CNV discovery. Recently, a number of CNV defining algorithms from SNP genotyping data have been developed; however, due to the fundamental limitation of SNP genotyping data for the measurement of signal intensity, there are still concerns regarding the possibility of false discovery or low sensitivity for detecting CNVs. In this study, we aimed to verify the effect of combining multiple CNV calling algorithms and set up the most reliable pipeline for CNV calling with Affymetrix Genomewide SNP 5.0 data. For this purpose, we selected the 3 most commonly used algorithms for CNV segmentation from SNP genotyping data: PennCNV, QuantiSNP, and BirdSuite. After defining the CNV loci using the 3 different algorithms, we assessed how many of them overlapped with each other, and we also validated the CNVs by genomic quantitative PCR. Through this analysis, we proposed that for reliable CNV-based genomewide association study using SNP array data, CNV calls must be performed with at least 3 different algorithms and that the CNVs consistently called from more than 2 algorithms must be used for association analysis, because they are more reliable than the CNVs called from a single algorithm. Our result will be helpful to set up the CNV analysis protocols for Affymetrix Genomewide SNP 5.0 genotyping data.
Korea Genome Organization 2012-09 2012-09-28 /pmc/articles/PMC3492655/ /pubmed/23166530 http://dx.doi.org/10.5808/GI.2012.10.3.194 Text en Copyright © 2012 by The Korea Genome Organization http://creativecommons.org/licenses/by-nc/3.0 It is identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/).
spellingShingle Original Article
Kim, Soon-Young
Kim, Ji-Hong
Chung, Yeun-Jun
Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data
title Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data
title_sort effect of combining multiple cnv defining algorithms on the reliability of cnv calls from snp genotyping data
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3492655/
https://www.ncbi.nlm.nih.gov/pubmed/23166530
http://dx.doi.org/10.5808/GI.2012.10.3.194