PennCNV in whole-genome sequencing data
BACKGROUND: The use of high-throughput sequencing data has improved the results of genomic analysis due to the resolution of mapping algorithms. Although several tools for copy-number variation calling in whole genome sequencing have been published, the noisy nature of sequencing data is still a lim...
Main authors: | de Araújo Lima, Leandro; Wang, Kai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629549/ https://www.ncbi.nlm.nih.gov/pubmed/28984186 http://dx.doi.org/10.1186/s12859-017-1802-x |
_version_ | 1783269064726020096 |
---|---|
author | de Araújo Lima, Leandro Wang, Kai |
author_facet | de Araújo Lima, Leandro Wang, Kai |
author_sort | de Araújo Lima, Leandro |
collection | PubMed |
description | BACKGROUND: The use of high-throughput sequencing data has improved the results of genomic analysis due to the resolution of mapping algorithms. Although several tools for copy-number variation calling in whole genome sequencing have been published, the noisy nature of sequencing data is still a limitation for accuracy and concordance among such tools. To assess the performance of PennCNV original algorithm for array data in whole genome sequencing data, we processed mapping (BAM) files to extract coverage, representing log R ratio (LRR) of signal intensity, and B allele frequency (BAF). RESULTS: We used high quality sample NA12878 from the recently reported NIST database and created 10 artificial samples with several CNVs spread along all chromosomes. We compared PennCNV-Seq with other tools with general deletions and duplications, as well as for different number of copies and copy-neutral loss-of-heterozygosity (LOH). CONCLUSION: PennCNV-Seq was able to find correct CNVs and can be integrated in existing CNV calling pipelines to report accurately the number of copies in specific genomic regions. |
format | Online Article Text |
id | pubmed-5629549 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-56295492017-10-13 PennCNV in whole-genome sequencing data de Araújo Lima, Leandro Wang, Kai BMC Bioinformatics Research BACKGROUND: The use of high-throughput sequencing data has improved the results of genomic analysis due to the resolution of mapping algorithms. Although several tools for copy-number variation calling in whole genome sequencing have been published, the noisy nature of sequencing data is still a limitation for accuracy and concordance among such tools. To assess the performance of PennCNV original algorithm for array data in whole genome sequencing data, we processed mapping (BAM) files to extract coverage, representing log R ratio (LRR) of signal intensity, and B allele frequency (BAF). RESULTS: We used high quality sample NA12878 from the recently reported NIST database and created 10 artificial samples with several CNVs spread along all chromosomes. We compared PennCNV-Seq with other tools with general deletions and duplications, as well as for different number of copies and copy-neutral loss-of-heterozygosity (LOH). CONCLUSION: PennCNV-Seq was able to find correct CNVs and can be integrated in existing CNV calling pipelines to report accurately the number of copies in specific genomic regions. BioMed Central 2017-10-03 /pmc/articles/PMC5629549/ /pubmed/28984186 http://dx.doi.org/10.1186/s12859-017-1802-x Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research de Araújo Lima, Leandro Wang, Kai PennCNV in whole-genome sequencing data |
title | PennCNV in whole-genome sequencing data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629549/ https://www.ncbi.nlm.nih.gov/pubmed/28984186 http://dx.doi.org/10.1186/s12859-017-1802-x |
work_keys_str_mv | AT dearaujolimaleandro penncnvinwholegenomesequencingdata AT wangkai penncnvinwholegenomesequencingdata |
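
The description above says that coverage extracted from BAM files stands in for the log R ratio (LRR) and that B allele frequency (BAF) is derived alongside it. The record does not give the exact normalization PennCNV-Seq uses, so the following is only a minimal sketch under stated assumptions: LRR is taken as the log2 ratio of a window's coverage to the sample-wide median coverage, and BAF as the fraction of reads supporting the alternate allele at a SNP. The function names, the `pseudocount` parameter, and the input layout are hypothetical and are not taken from the paper or from PennCNV.

```python
import math
from statistics import median


def log_r_ratio(window_coverage, pseudocount=0.5):
    """Approximate LRR per window as log2(coverage / median coverage).

    Assumption: the sample-wide median is used as the expected (2-copy)
    coverage. The pseudocount avoids log2(0) for windows with no reads.
    """
    expected = median(window_coverage)
    return [
        math.log2((cov + pseudocount) / (expected + pseudocount))
        for cov in window_coverage
    ]


def b_allele_frequency(ref_reads, alt_reads):
    """Approximate BAF at one SNP as alt / (ref + alt).

    Returns None when no reads cover the site, since the ratio is undefined.
    """
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return alt_reads / total


if __name__ == "__main__":
    # Hypothetical per-window mean coverage: one half-coverage window
    # (deletion-like) and one double-coverage window (duplication-like).
    coverage = [30, 30, 15, 30, 60]
    print(log_r_ratio(coverage, pseudocount=0.0))
    # A heterozygous-looking site and an uncovered site.
    print(b_allele_frequency(10, 10), b_allele_frequency(0, 0))
```

In this sketch a window at half the median coverage gets a negative LRR and a window at twice the median gets a positive one, which is the kind of signal an array-oriented caller expects; real pipelines would typically also correct for GC content and mappability, which are omitted here.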