Family-Based Benchmarking of Copy Number Variation Detection Software

The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and Afr...

Bibliographic Details
Main authors: Nutsua, Marcel Elie; Fischer, Annegret; Nebel, Almut; Hofmann, Sylvia; Schreiber, Stefan; Krawczak, Michael; Nothnagel, Michael
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4510559/
https://www.ncbi.nlm.nih.gov/pubmed/26197066
http://dx.doi.org/10.1371/journal.pone.0133465
_version_ 1782382192585867264
author Nutsua, Marcel Elie
Fischer, Annegret
Nebel, Almut
Hofmann, Sylvia
Schreiber, Stefan
Krawczak, Michael
Nothnagel, Michael
author_sort Nutsua, Marcel Elie
collection PubMed
description The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated CNVs (34-60%). Moreover, up to 20% of apparent family-based validations were found to be due to chance alone. Software using Hidden Markov models (HMM) showed a trend to predict fewer CNVs than segmentation-based algorithms albeit with greater validity. PennCNV yielded the highest prediction accuracy (60.9%). Finally, the pairwise concordance of CNV prediction was found to vary widely with the software tools involved. We recommend HMM-based software, in particular PennCNV, rather than segmentation-based algorithms when validity is the primary concern of CNV detection. QuantiSNP may be used as an additional tool to detect sets of CNVs not detectable by the other tools. Our study also reemphasizes the need for laboratory-based validation, such as qPCR, of CNVs predicted in silico.
format Online
Article
Text
id pubmed-4510559
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4510559 2015-07-24 Family-Based Benchmarking of Copy Number Variation Detection Software Nutsua, Marcel Elie Fischer, Annegret Nebel, Almut Hofmann, Sylvia Schreiber, Stefan Krawczak, Michael Nothnagel, Michael PLoS One Research Article Public Library of Science 2015-07-21 /pmc/articles/PMC4510559/ /pubmed/26197066 http://dx.doi.org/10.1371/journal.pone.0133465 Text en © 2015 Nutsua et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Family-Based Benchmarking of Copy Number Variation Detection Software
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4510559/
https://www.ncbi.nlm.nih.gov/pubmed/26197066
http://dx.doi.org/10.1371/journal.pone.0133465