Comprehensive Analysis to Improve the Validation Rate for Single Nucleotide Variants Detected by Next-Generation Sequencing
Next-generation sequencing (NGS) has enabled the high-throughput discovery of germline and somatic mutations. However, NGS-based variant detection is still prone to errors, resulting in inaccurate variant calls. Here, we categorized the variants detected by NGS according to total read depth (TD) and...
Main Authors: | Park, Mi-Hyun; Rhee, Hwanseok; Park, Jung Hoon; Woo, Hae-Mi; Choi, Byung-Ok; Kim, Bo-Young; Chung, Ki Wha; Cho, Yoo-Bok; Kim, Hyung Jin; Jung, Ji-Won; Koo, Soo Kyung |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3906084/ https://www.ncbi.nlm.nih.gov/pubmed/24489763 http://dx.doi.org/10.1371/journal.pone.0086664 |
_version_ | 1782301435146272768 |
---|---|
author | Park, Mi-Hyun Rhee, Hwanseok Park, Jung Hoon Woo, Hae-Mi Choi, Byung-Ok Kim, Bo-Young Chung, Ki Wha Cho, Yoo-Bok Kim, Hyung Jin Jung, Ji-Won Koo, Soo Kyung |
author_sort | Park, Mi-Hyun |
collection | PubMed |
description | Next-generation sequencing (NGS) has enabled the high-throughput discovery of germline and somatic mutations. However, NGS-based variant detection is still prone to errors, resulting in inaccurate variant calls. Here, we categorized the variants detected by NGS according to total read depth (TD) and SNP quality (SNPQ), and performed Sanger sequencing with 348 selected non-synonymous single nucleotide variants (SNVs) for validation. Using the SAMtools and GATK algorithms, the validation rate was positively correlated with SNPQ but showed no correlation with TD. In addition, common variants called by both programs had a higher validation rate than caller-specific variants. We further examined several parameters to improve the validation rate, and found that strand bias (SB) was a key parameter. SB in NGS data showed a strong difference between the variants passing validation and those that failed validation, showing a validation rate of more than 92% (filtering cutoff value: alternate allele forward [AF]≥20 and AF<80 in SAMtools, SB<–10 in GATK). Moreover, the validation rate increased significantly (up to 97–99%) when the variant was filtered together with the suggested values of mapping quality (MQ), SNPQ and SB. This detailed and systematic study provides comprehensive recommendations for improving validation rates, saving time and lowering cost in NGS analyses. |
format | Online Article Text |
id | pubmed-3906084 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3906084 2014-01-31 PLoS One Research Article Public Library of Science 2014-01-29 /pmc/articles/PMC3906084/ /pubmed/24489763 http://dx.doi.org/10.1371/journal.pone.0086664 Text en © 2014 Park et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Comprehensive Analysis to Improve the Validation Rate for Single Nucleotide Variants Detected by Next-Generation Sequencing |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3906084/ https://www.ncbi.nlm.nih.gov/pubmed/24489763 http://dx.doi.org/10.1371/journal.pone.0086664 |
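
The description field above reports strand-bias cutoffs (AF≥20 and AF<80 for SAMtools, SB<–10 for GATK). A minimal Python sketch of how such a filter might be applied follows. It is not the authors' code: the `Variant` record, its field names, and the helper functions are hypothetical, and it assumes that AF means the percentage of alternate-allele reads on the forward strand, which the abstract does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical per-variant record; field names are illustrative only."""
    alt_forward: int                 # alternate-allele reads on the forward strand
    alt_reverse: int                 # alternate-allele reads on the reverse strand
    gatk_sb: Optional[float] = None  # strand-bias score as reported by GATK, if any


def alt_forward_percent(v: Variant) -> Optional[float]:
    """Percentage of alternate-allele reads on the forward strand.

    Assumption: this is what the abstract calls AF.
    Returns None when there are no alternate-allele reads.
    """
    total = v.alt_forward + v.alt_reverse
    if total == 0:
        return None
    return 100.0 * v.alt_forward / total


def passes_samtools_sb(v: Variant) -> bool:
    """Keep variants with 20 <= AF < 80 (cutoff quoted in the abstract)."""
    af = alt_forward_percent(v)
    return af is not None and 20.0 <= af < 80.0


def passes_gatk_sb(v: Variant) -> bool:
    """Keep variants with SB < -10 (cutoff quoted in the abstract)."""
    return v.gatk_sb is not None and v.gatk_sb < -10.0
```

For example, a variant with 9 forward and 1 reverse alternate reads has AF = 90 and would be dropped by `passes_samtools_sb`, while one with 5 and 5 would be kept. The abstract also recommends combining this with mapping quality (MQ) and SNPQ cutoffs, but it does not state those values, so they are left out here.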