
Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans

BACKGROUND: High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools...

Bibliographic Details
Main Authors: Haraksingh, Rajini R., Abyzov, Alexej, Urban, Alexander Eckehart
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5402652/
https://www.ncbi.nlm.nih.gov/pubmed/28438122
http://dx.doi.org/10.1186/s12864-017-3658-x
_version_ 1783231269866307584
author Haraksingh, Rajini R.
Abyzov, Alexej
Urban, Alexander Eckehart
author_facet Haraksingh, Rajini R.
Abyzov, Alexej
Urban, Alexander Eckehart
author_sort Haraksingh, Rajini R.
collection PubMed
description BACKGROUND: High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools for genome analysis in the coming years. We systematically compared the genome-wide CNV detection power of all 17 available array designs from the Affymetrix, Agilent, and Illumina platforms by hybridizing the well-characterized genome of 1000 Genomes Project subject NA12878 to all arrays, and performing data analysis using both manufacturer-recommended and platform-independent software. We benchmarked the resulting CNV call sets from each array using a gold standard set of CNVs for this genome derived from 1000 Genomes Project whole genome sequencing data. RESULTS: The arrays tested comprise both SNP and aCGH platforms with varying designs and contain between ~0.5 and ~4.6 million probes. Across the arrays, CNV detection varied widely in number of CNV calls (4–489), CNV size range (~40 bp to ~8 Mbp), and percentage of non-validated CNVs (0–86%). We discovered strikingly strong effects of specific array design principles on performance. For example, some SNP array designs with the largest numbers of probes and extensive exonic coverage produced a considerable number of CNV calls that could not be validated, compared to designs with probe numbers that are sometimes an order of magnitude smaller. This effect was only partially ameliorated using different analysis software and optimizing data analysis parameters. CONCLUSIONS: High-resolution microarrays will continue to be used as reliable, cost- and time-efficient tools for CNV analysis. However, different applications tolerate different limitations in CNV detection. Our study quantified how these arrays differ in total number and size range of detected CNVs as well as sensitivity, and determined how each array balances these attributes. This analysis will inform appropriate array selection for future CNV studies, and allow better assessment of the CNV-analytical power of both published and ongoing array-based genomics studies. Furthermore, our findings emphasize the importance of concurrent use of multiple analysis algorithms and independent experimental validation in array-based CNV detection studies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-3658-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5402652
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5402652 2017-04-27 Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans Haraksingh, Rajini R. Abyzov, Alexej Urban, Alexander Eckehart BMC Genomics Research Article BioMed Central 2017-04-24 /pmc/articles/PMC5402652/ /pubmed/28438122 http://dx.doi.org/10.1186/s12864-017-3658-x Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Haraksingh, Rajini R.
Abyzov, Alexej
Urban, Alexander Eckehart
Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans
title Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans
title_sort comprehensive performance comparison of high-resolution array platforms for genome-wide copy number variation (cnv) analysis in humans
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5402652/
https://www.ncbi.nlm.nih.gov/pubmed/28438122
http://dx.doi.org/10.1186/s12864-017-3658-x