PennCNV in whole-genome sequencing data

BACKGROUND: The use of high-throughput sequencing data has improved the results of genomic analysis due to the resolution of mapping algorithms. Although several tools for copy-number variation calling in whole genome sequencing have been published, the noisy nature of sequencing data is still a lim...

Bibliographic Details
Main authors: de Araújo Lima, Leandro; Wang, Kai
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629549/
https://www.ncbi.nlm.nih.gov/pubmed/28984186
http://dx.doi.org/10.1186/s12859-017-1802-x
author de Araújo Lima, Leandro
Wang, Kai
collection PubMed
description BACKGROUND: The use of high-throughput sequencing data has improved the results of genomic analysis due to the resolution of mapping algorithms. Although several tools for copy-number variation calling in whole genome sequencing have been published, the noisy nature of sequencing data is still a limitation for accuracy and concordance among such tools. To assess the performance of PennCNV original algorithm for array data in whole genome sequencing data, we processed mapping (BAM) files to extract coverage, representing log R ratio (LRR) of signal intensity, and B allele frequency (BAF). RESULTS: We used high quality sample NA12878 from the recently reported NIST database and created 10 artificial samples with several CNVs spread along all chromosomes. We compared PennCNV-Seq with other tools with general deletions and duplications, as well as for different number of copies and copy-neutral loss-of-heterozygosity (LOH). CONCLUSION: PennCNV-Seq was able to find correct CNVs and can be integrated in existing CNV calling pipelines to report accurately the number of copies in specific genomic regions.
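The abstract above describes deriving two array-style signals from sequencing data: coverage standing in for the log R ratio (LRR), and the B allele frequency (BAF). The sketch below illustrates that idea only; it is not the PennCNV-Seq implementation. The function names are hypothetical, and normalizing against the sample's median coverage is an assumption made for illustration, since the record does not state the reference the authors used.

```python
import math
from statistics import median


def log_r_ratio(coverages):
    """Approximate LRR per site as log2(coverage / median coverage).

    Assumption: the sample-wide median of non-zero coverage is the
    expected two-copy depth. Sites with zero coverage yield None.
    """
    positive = [c for c in coverages if c > 0]
    if not positive:
        return [None] * len(coverages)
    expected = median(positive)
    return [math.log2(c / expected) if c > 0 else None for c in coverages]


def b_allele_frequency(ref_reads, alt_reads):
    """BAF at a variant site: fraction of reads carrying the B (alternate) allele."""
    total = ref_reads + alt_reads
    return alt_reads / total if total else None


# Hypothetical per-site depths: one site at half depth, one at double depth.
lrr = log_r_ratio([30, 30, 15, 60, 30])
baf = b_allele_frequency(15, 15)
```

Under these assumptions, a site at half the expected depth gives a negative LRR (consistent with a deletion), a site at double depth gives a positive LRR (consistent with a duplication), and a balanced heterozygous site gives a BAF near 0.5.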
format Online
Article
Text
id pubmed-5629549
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5629549 2017-10-13
PennCNV in whole-genome sequencing data
de Araújo Lima, Leandro; Wang, Kai
BMC Bioinformatics, Research
BioMed Central 2017-10-03
/pmc/articles/PMC5629549/ /pubmed/28984186 http://dx.doi.org/10.1186/s12859-017-1802-x
Text en
© The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title PennCNV in whole-genome sequencing data
topic Research