Evaluation of copy number variation detection for a SNP array platform

Bibliographic Details
Main Authors: Zhang, Xin; Du, Renqian; Li, Shilin; Zhang, Feng; Jin, Li; Wang, Hongyan
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4015297/
https://www.ncbi.nlm.nih.gov/pubmed/24555668
http://dx.doi.org/10.1186/1471-2105-15-50
author Zhang, Xin
Du, Renqian
Li, Shilin
Zhang, Feng
Jin, Li
Wang, Hongyan
collection PubMed
description BACKGROUND: Copy number variations (CNVs) are usually inferred from single nucleotide polymorphism (SNP) arrays using software packages that are each based on a particular algorithm. However, the performance of these packages is not well understood, which makes it difficult to select one or several packages for CNV detection on a SNP array platform. We selected four publicly available software packages designed for CNV calling from an Affymetrix SNP array: Birdsuite, dChip, Genotyping Console (GTC) and PennCNV. A publicly available dataset generated by array-based comparative genomic hybridization (CGH), with a resolution of 24 million probes per sample, was treated as the “gold standard”. The success rate, average stability rate, sensitivity, consistency and reproducibility of the four packages were assessed against this CGH-based “gold standard”. Specifically, we also compared the efficiency of detecting CNVs simultaneously with two, three or all four packages against that of a single package.
RESULTS: Judged simply by the number of CNVs detected, Birdsuite detected the most and GTC the fewest. Birdsuite and dChip showed an obvious detection bias, and GTC appeared inferior because it detected the fewest CNVs. We then investigated the consistency between the calls of each software package and those of the other three. dChip had the lowest consistency and GTC the highest. In the comparison with the CGH-based CNV calls, GTC called the most CNVs in the matching group, with PennCNV-Affy ranking second; in the non-overlapping group, GTC called the fewest CNVs. With regard to the reproducibility of CNV calling, larger CNVs were usually replicated better. PennCNV-Affy showed the best consistency and Birdsuite the poorest.
CONCLUSION: We found that PennCNV outperformed the other three packages in the sensitivity and specificity of CNV calling. Each calling method nevertheless had its own limitations and advantages for different data analyses. Therefore, optimized calling methods might be identified by using multiple algorithms to evaluate the concordance and discordance of SNP array-based CNV calls.
format Online
Article
Text
id pubmed-4015297
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4015297 2014-05-10
Evaluation of copy number variation detection for a SNP array platform
Zhang, Xin; Du, Renqian; Li, Shilin; Zhang, Feng; Jin, Li; Wang, Hongyan
BMC Bioinformatics
Software
BioMed Central 2014-02-21
/pmc/articles/PMC4015297/
/pubmed/24555668
http://dx.doi.org/10.1186/1471-2105-15-50
Text en
Copyright © 2014 Zhang et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
title Evaluation of copy number variation detection for a SNP array platform
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4015297/
https://www.ncbi.nlm.nih.gov/pubmed/24555668
http://dx.doi.org/10.1186/1471-2105-15-50