
A remark on copy number variation detection methods

Copy number variations (CNVs) are gains and losses of DNA sequence in a genome. High-throughput platforms such as microarrays and next-generation sequencing (NGS) technologies have been applied to detect genome-wide copy number losses. Although progress has been made in both approaches, the accuracy and cons...


Bibliographic Details
Main Authors: Li, Shuo; Dou, Xialiang; Gao, Ruiqi; Ge, Xinzhou; Qian, Minping; Wan, Lin
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5922522/
https://www.ncbi.nlm.nih.gov/pubmed/29702671
http://dx.doi.org/10.1371/journal.pone.0196226
author Li, Shuo
Dou, Xialiang
Gao, Ruiqi
Ge, Xinzhou
Qian, Minping
Wan, Lin
collection PubMed
description Copy number variations (CNVs) are gains and losses of DNA sequence in a genome. High-throughput platforms such as microarrays and next-generation sequencing (NGS) technologies have been applied to detect genome-wide copy number losses. Although progress has been made in both approaches, the accuracy and consistency of CNV calling from the two platforms remain in dispute. In this study, we perform a deep analysis of copy number losses in 254 human DNA samples that have both SNP microarray data and NGS data publicly available from the HapMap Project and the 1000 Genomes Project, respectively. We show that the copy number losses reported by the HapMap Project and the 1000 Genomes Project have < 30% overlap, even though state-of-the-art calling methods were employed and both projects require their reports to have cross-platform experimental support (e.g. PCR, microarray and high-throughput sequencing). On the other hand, almost all of the copy number losses found directly from HapMap microarray data by an accurate algorithm, CNVhac, have lower read mapping depth in the NGS data; furthermore, 88% of them are supported by sequences containing breakpoints in the NGS data. Our results suggest that microarrays are able to call CNVs and that the unessential requirement of additional cross-platform support may introduce false negatives. The inconsistency between the CNV reports of the HapMap Project and the 1000 Genomes Project might result from inadequate information contained in the microarray data, inconsistent detection criteria, or the filtering effect of cross-platform support. Statistical tests on the CNVs called by CNVhac show that microarray data can offer reliable CNV reports, and that the majority of CNV candidates can be confirmed by raw sequences. Therefore, the CNV candidates given by a good caller can be highly reliable without cross-platform support, so additional experimental information should be applied when needed rather than as a blanket requirement.
format Online
Article
Text
id pubmed-5922522
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5922522 2018-05-11 A remark on copy number variation detection methods Li, Shuo; Dou, Xialiang; Gao, Ruiqi; Ge, Xinzhou; Qian, Minping; Wan, Lin PLoS One Research Article Copy number variations (CNVs) are gains and losses of DNA sequence in a genome. High-throughput platforms such as microarrays and next-generation sequencing (NGS) technologies have been applied to detect genome-wide copy number losses. Although progress has been made in both approaches, the accuracy and consistency of CNV calling from the two platforms remain in dispute. In this study, we perform a deep analysis of copy number losses in 254 human DNA samples that have both SNP microarray data and NGS data publicly available from the HapMap Project and the 1000 Genomes Project, respectively. We show that the copy number losses reported by the HapMap Project and the 1000 Genomes Project have < 30% overlap, even though state-of-the-art calling methods were employed and both projects require their reports to have cross-platform experimental support (e.g. PCR, microarray and high-throughput sequencing). On the other hand, almost all of the copy number losses found directly from HapMap microarray data by an accurate algorithm, CNVhac, have lower read mapping depth in the NGS data; furthermore, 88% of them are supported by sequences containing breakpoints in the NGS data. Our results suggest that microarrays are able to call CNVs and that the unessential requirement of additional cross-platform support may introduce false negatives. The inconsistency between the CNV reports of the HapMap Project and the 1000 Genomes Project might result from inadequate information contained in the microarray data, inconsistent detection criteria, or the filtering effect of cross-platform support. Statistical tests on the CNVs called by CNVhac show that microarray data can offer reliable CNV reports, and that the majority of CNV candidates can be confirmed by raw sequences. Therefore, the CNV candidates given by a good caller can be highly reliable without cross-platform support, so additional experimental information should be applied when needed rather than as a blanket requirement. Public Library of Science 2018-04-27 /pmc/articles/PMC5922522/ /pubmed/29702671 http://dx.doi.org/10.1371/journal.pone.0196226 Text en © 2018 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A remark on copy number variation detection methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5922522/
https://www.ncbi.nlm.nih.gov/pubmed/29702671
http://dx.doi.org/10.1371/journal.pone.0196226