
Improved Variant Calling Accuracy by Merging Replicates in Whole-Exome Sequencing Studies

In large scale population-based whole-exome sequencing (WES) studies, there are some samples occasionally sequenced two or more times due to a variety of reasons. To investigate how to efficiently utilize these duplicated sequencing data, we conducted comprehensive evaluation of variant calling stra...


Bibliographic Details
Main Authors: Zhang, Yanfeng; Li, Bingshan; Li, Chun; Cai, Qiuyin; Zheng, Wei; Long, Jirong
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4137624/
https://www.ncbi.nlm.nih.gov/pubmed/25162009
http://dx.doi.org/10.1155/2014/319534
_version_ 1782331130863681536
author Zhang, Yanfeng
Li, Bingshan
Li, Chun
Cai, Qiuyin
Zheng, Wei
Long, Jirong
author_facet Zhang, Yanfeng
Li, Bingshan
Li, Chun
Cai, Qiuyin
Zheng, Wei
Long, Jirong
author_sort Zhang, Yanfeng
collection PubMed
description In large scale population-based whole-exome sequencing (WES) studies, there are some samples occasionally sequenced two or more times due to a variety of reasons. To investigate how to efficiently utilize these duplicated sequencing data, we conducted comprehensive evaluation of variant calling strategies. 92 samples subjected to WES twice were selected from a large population study. These 92 duplicated samples were divided into two groups: group H consisting of the higher sequencing depth for each subject and group L consisting of the lower depth for each subject. The merged samples for each subject were put in a third group M. Using the GATK multisample toolkit, we compared variant calling accuracy among three strategies. Hierarchical clustering analysis indicated that the two replicates for each subject showed high homogeneity. The comparative analyses on the basis of heterozygous-homozygous ratio (Hete/Homo), transition-transversion ratio (Ti/Tv), and overlapping rate with the 1000 Genomes Project consistently showed that the data quality of the SNPs detected from the M group was more accurate than that of SNPs detected from the H and L groups. These results suggested that merging homogeneous duplicated exomes instead of using one of them could improve variant calling accuracy.
format Online
Article
Text
id pubmed-4137624
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-41376242014-08-26 Improved Variant Calling Accuracy by Merging Replicates in Whole-Exome Sequencing Studies Zhang, Yanfeng Li, Bingshan Li, Chun Cai, Qiuyin Zheng, Wei Long, Jirong Biomed Res Int Research Article In large scale population-based whole-exome sequencing (WES) studies, there are some samples occasionally sequenced two or more times due to a variety of reasons. To investigate how to efficiently utilize these duplicated sequencing data, we conducted comprehensive evaluation of variant calling strategies. 92 samples subjected to WES twice were selected from a large population study. These 92 duplicated samples were divided into two groups: group H consisting of the higher sequencing depth for each subject and group L consisting of the lower depth for each subject. The merged samples for each subject were put in a third group M. Using the GATK multisample toolkit, we compared variant calling accuracy among three strategies. Hierarchical clustering analysis indicated that the two replicates for each subject showed high homogeneity. The comparative analyses on the basis of heterozygous-homozygous ratio (Hete/Homo), transition-transversion ratio (Ti/Tv), and overlapping rate with the 1000 Genomes Project consistently showed that the data quality of the SNPs detected from the M group was more accurate than that of SNPs detected from the H and L groups. These results suggested that merging homogeneous duplicated exomes instead of using one of them could improve variant calling accuracy. Hindawi Publishing Corporation 2014 2014-08-04 /pmc/articles/PMC4137624/ /pubmed/25162009 http://dx.doi.org/10.1155/2014/319534 Text en Copyright © 2014 Yanfeng Zhang et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Improved Variant Calling Accuracy by Merging Replicates in Whole-Exome Sequencing Studies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4137624/
https://www.ncbi.nlm.nih.gov/pubmed/25162009
http://dx.doi.org/10.1155/2014/319534
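
The abstract above compares call sets using three summary metrics: the heterozygous-homozygous ratio (Hete/Homo), the transition-transversion ratio (Ti/Tv), and the overlapping rate with the 1000 Genomes Project. The following is a minimal Python sketch of how such metrics can be computed from a list of SNP calls. It is not the authors' pipeline (the study used the GATK multisample toolkit); the function names, the input shapes, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Purines and pyrimidines; a transition stays within one class,
# a transversion crosses between the two classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """Return True if the ref>alt substitution is a transition."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(snps: List[Tuple[str, str]]) -> float:
    """Ti/Tv ratio over (ref, alt) pairs of single-nucleotide variants."""
    ti = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ti
    return ti / tv if tv else float("inf")


def het_hom_ratio(genotypes: Iterable[Tuple[int, int]]) -> float:
    """Hete/Homo ratio over diploid genotypes given as allele-index pairs.

    Assumption: 0 is the reference allele, and homozygous-reference
    genotypes (0, 0) are ignored because they are not variant calls.
    """
    het = hom = 0
    for a, b in genotypes:
        if a != b:
            het += 1
        elif a != 0:
            hom += 1
    return het / hom if hom else float("inf")


def overlap_rate(called: Set[Tuple[str, int]], known: Set[Tuple[str, int]]) -> float:
    """Fraction of called sites (chrom, pos) also present in a known set."""
    return len(called & known) / len(called) if called else 0.0


# Toy data (hypothetical, not taken from the study).
toy_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "A")]
toy_genotypes = [(0, 1), (0, 1), (1, 1), (0, 0)]
toy_called = {("1", 100), ("1", 200), ("2", 50), ("3", 7)}
toy_known = {("1", 100), ("2", 50), ("2", 60)}

print(ti_tv_ratio(toy_snps))
print(het_hom_ratio(toy_genotypes))
print(overlap_rate(toy_called, toy_known))
```

In a comparison like the one the abstract describes, these functions would be applied separately to the call sets of groups H, L, and M, with the known-site set standing in for 1000 Genomes Project variants.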