Improved Variant Calling Accuracy by Merging Replicates in Whole-Exome Sequencing Studies
In large scale population-based whole-exome sequencing (WES) studies, there are some samples occasionally sequenced two or more times due to a variety of reasons. To investigate how to efficiently utilize these duplicated sequencing data, we conducted comprehensive evaluation of variant calling stra...
Main Authors: | Zhang, Yanfeng; Li, Bingshan; Li, Chun; Cai, Qiuyin; Zheng, Wei; Long, Jirong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2014 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4137624/ https://www.ncbi.nlm.nih.gov/pubmed/25162009 http://dx.doi.org/10.1155/2014/319534 |
_version_ | 1782331130863681536 |
---|---|
author | Zhang, Yanfeng Li, Bingshan Li, Chun Cai, Qiuyin Zheng, Wei Long, Jirong |
author_sort | Zhang, Yanfeng |
collection | PubMed |
description | In large scale population-based whole-exome sequencing (WES) studies, there are some samples occasionally sequenced two or more times due to a variety of reasons. To investigate how to efficiently utilize these duplicated sequencing data, we conducted comprehensive evaluation of variant calling strategies. 92 samples subjected to WES twice were selected from a large population study. These 92 duplicated samples were divided into two groups: group H consisting of the higher sequencing depth for each subject and group L consisting of the lower depth for each subject. The merged samples for each subject were put in a third group M. Using the GATK multisample toolkit, we compared variant calling accuracy among three strategies. Hierarchical clustering analysis indicated that the two replicates for each subject showed high homogeneity. The comparative analyses on the basis of heterozygous-homozygous ratio (Hete/Homo), transition-transversion ratio (Ti/Tv), and overlapping rate with the 1000 Genomes Project consistently showed that the data quality of the SNPs detected from the M group was more accurate than that of SNPs detected from the H and L groups. These results suggested that merging homogeneous duplicated exomes instead of using one of them could improve variant calling accuracy. |
format | Online Article Text |
id | pubmed-4137624 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-41376242014-08-26 Improved Variant Calling Accuracy by Merging Replicates in Whole-Exome Sequencing Studies Zhang, Yanfeng Li, Bingshan Li, Chun Cai, Qiuyin Zheng, Wei Long, Jirong Biomed Res Int Research Article In large scale population-based whole-exome sequencing (WES) studies, there are some samples occasionally sequenced two or more times due to a variety of reasons. To investigate how to efficiently utilize these duplicated sequencing data, we conducted comprehensive evaluation of variant calling strategies. 92 samples subjected to WES twice were selected from a large population study. These 92 duplicated samples were divided into two groups: group H consisting of the higher sequencing depth for each subject and group L consisting of the lower depth for each subject. The merged samples for each subject were put in a third group M. Using the GATK multisample toolkit, we compared variant calling accuracy among three strategies. Hierarchical clustering analysis indicated that the two replicates for each subject showed high homogeneity. The comparative analyses on the basis of heterozygous-homozygous ratio (Hete/Homo), transition-transversion ratio (Ti/Tv), and overlapping rate with the 1000 Genomes Project consistently showed that the data quality of the SNPs detected from the M group was more accurate than that of SNPs detected from the H and L groups. These results suggested that merging homogeneous duplicated exomes instead of using one of them could improve variant calling accuracy. Hindawi Publishing Corporation 2014 2014-08-04 /pmc/articles/PMC4137624/ /pubmed/25162009 http://dx.doi.org/10.1155/2014/319534 Text en Copyright © 2014 Yanfeng Zhang et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Improved Variant Calling Accuracy by Merging Replicates in Whole-Exome Sequencing Studies |
title_sort | improved variant calling accuracy by merging replicates in whole-exome sequencing studies |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4137624/ https://www.ncbi.nlm.nih.gov/pubmed/25162009 http://dx.doi.org/10.1155/2014/319534 |
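
The description above compares call sets using three summary metrics: the heterozygous-homozygous ratio (Hete/Homo), the transition-transversion ratio (Ti/Tv), and the overlapping rate with the 1000 Genomes Project. The following is a minimal Python sketch of how such metrics might be computed from a list of SNP calls. It is not the authors' pipeline (the record states only that the GATK multisample toolkit was used); the function names, the input layout, and the toy data are assumptions made purely for illustration.

```python
# Illustrative sketch only: hypothetical helpers, not the method used in the article.

# The four transitions (purine<->purine, pyrimidine<->pyrimidine).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = {"A", "C", "G", "T"}


def ti_tv_ratio(snps):
    """Return transitions / transversions for (ref, alt) single-base pairs."""
    ti = tv = 0
    for ref, alt in snps:
        # Skip anything that is not a simple single-base substitution.
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else float("inf")


def het_hom_ratio(genotypes):
    """Return heterozygous / homozygous-variant counts from VCF-style GT strings."""
    het = hom = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        # Ignore missing calls and homozygous-reference calls.
        if "." in alleles or all(a == "0" for a in alleles):
            continue
        if len(set(alleles)) > 1:
            het += 1
        else:
            hom += 1
    return het / hom if hom else float("inf")


def overlap_rate(called, reference):
    """Return the fraction of called variants that also appear in a reference set."""
    called = set(called)
    if not called:
        return 0.0
    return len(called & set(reference)) / len(called)


if __name__ == "__main__":
    # Toy data, invented for the example.
    snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "A")]
    gts = ["0/1", "1/1", "0|1", "0/0", "./."]
    called = [("1", 100, "A", "G"), ("1", 200, "C", "T")]
    known = [("1", 100, "A", "G")]
    print(ti_tv_ratio(snps), het_hom_ratio(gts), overlap_rate(called, known))
```

In a comparison like the one described, each metric would be computed separately for the H, L, and M call sets and then compared across groups.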