Reproducibility of Variant Calls in Replicate Next Generation Sequencing Experiments
Nucleotide alterations detected by next generation sequencing are not always true biological changes but could represent sequencing errors. Even highly accurate methods can yield substantial error rates when applied to millions of nucleotides. In this study, we examined the reproducibility of nucleo...
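Main Authors: | Qi, Yuan; Liu, Xiuping; Liu, Chang-gong; Wang, Bailing; Hess, Kenneth R.; Symmans, W. Fraser; Shi, Weiwei; Pusztai, Lajos |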
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489803/ https://www.ncbi.nlm.nih.gov/pubmed/26136146 http://dx.doi.org/10.1371/journal.pone.0119230 |
_version_ | 1782379421582229504 |
---|---|
author | Qi, Yuan Liu, Xiuping Liu, Chang-gong Wang, Bailing Hess, Kenneth R. Symmans, W. Fraser Shi, Weiwei Pusztai, Lajos |
author_sort | Qi, Yuan |
collection | PubMed |
description | Nucleotide alterations detected by next generation sequencing are not always true biological changes but could represent sequencing errors. Even highly accurate methods can yield substantial error rates when applied to millions of nucleotides. In this study, we examined the reproducibility of nucleotide variant calls in replicate sequencing experiments of the same genomic DNA. We performed targeted sequencing of all known human protein kinase genes (kinome) (~3.2 Mb) using the SOLiD v4 platform. Seventeen breast cancer samples were sequenced in duplicate (n=14) or triplicate (n=3) to assess concordance of all calls and single nucleotide variant (SNV) calls. The concordance rates over the entire sequenced region were >99.99%, while the concordance rates for SNVs were 54.3-75.5%. There was substantial variation in basic sequencing metrics from experiment to experiment. The type of nucleotide substitution and genomic location of the variant had little impact on concordance but concordance increased with coverage level, variant allele count (VAC), variant allele frequency (VAF), variant allele quality and p-value of SNV-call. The most important determinants of concordance were VAC and VAF. Even using the highest stringency of QC metrics the reproducibility of SNV calls was around 80% suggesting that erroneous variant calling can be as high as 20-40% in a single experiment. The sequence data have been deposited into the European Genome-phenome Archive (EGA) with accession number EGAS00001000826. |
format | Online Article Text |
id | pubmed-4489803 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4489803 2015-07-15 Reproducibility of Variant Calls in Replicate Next Generation Sequencing Experiments Qi, Yuan Liu, Xiuping Liu, Chang-gong Wang, Bailing Hess, Kenneth R. Symmans, W. Fraser Shi, Weiwei Pusztai, Lajos PLoS One Research Article Public Library of Science 2015-07-02 /pmc/articles/PMC4489803/ /pubmed/26136146 http://dx.doi.org/10.1371/journal.pone.0119230 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. |
title | Reproducibility of Variant Calls in Replicate Next Generation Sequencing Experiments |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489803/ https://www.ncbi.nlm.nih.gov/pubmed/26136146 http://dx.doi.org/10.1371/journal.pone.0119230 |
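The abstract in this record reports concordance between replicate runs without saying how it was computed. The sketch below is one plausible reading, not the authors' method. It assumes SNV concordance means the share of calls found in both replicates among calls found in either, and it uses made-up variant tuples. The helper names `snv_concordance` and `max_discordant_positions` are hypothetical. The second helper works through the abstract's point that a concordance above 99.99% across roughly 3.2 Mb still leaves room for a few hundred discordant positions.

```python
def snv_concordance(calls_a, calls_b):
    """Fraction of SNV calls shared by two replicates (intersection / union).

    This definition is an assumption for illustration; the record does not
    state how the study computed concordance.
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 1.0  # no calls in either replicate: treated as fully concordant
    return len(a & b) / len(union)


def max_discordant_positions(region_bp, concordance):
    """Number of positions that may disagree at a given overall concordance."""
    return region_bp * (1.0 - concordance)


# Hypothetical calls as (chromosome, position, reference, alternate) tuples.
rep1 = {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"), ("chr3", 3000, "G", "A")}
rep2 = {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"), ("chr7", 7000, "T", "C")}

# Two calls are shared out of four distinct calls overall.
print(snv_concordance(rep1, rep2))

# ~3.2 Mb target at 99.99% concordance: about 320 positions can still differ.
print(round(max_discordant_positions(3_200_000, 0.9999)))
```

Under this definition, a call seen in only one replicate lowers concordance whichever run produced it. That fits a replicate design, where neither run is treated as the reference.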