Performance assessment of variant calling pipelines using human whole exome sequencing and simulated data
BACKGROUND: Whole exome sequencing (WES) is a cost-effective method that identifies clinical variants but it demands accurate variant caller tools. Currently available tools have variable accuracy in predicting specific clinical variants. But it may be possible to find the best combination of aligne...
Main authors: | Kumaran, Manojkumar; Subramanian, Umadevi; Devarajan, Bharanidharan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6580603/ https://www.ncbi.nlm.nih.gov/pubmed/31208315 http://dx.doi.org/10.1186/s12859-019-2928-9 |
_version_ | 1783428053009956864 |
---|---|
author | Kumaran, Manojkumar Subramanian, Umadevi Devarajan, Bharanidharan |
author_sort | Kumaran, Manojkumar |
collection | PubMed |
description | BACKGROUND: Whole exome sequencing (WES) is a cost-effective method that identifies clinical variants but it demands accurate variant caller tools. Currently available tools have variable accuracy in predicting specific clinical variants. But it may be possible to find the best combination of aligner-variant caller tools for detecting accurate single nucleotide variants (SNVs) and small insertion and deletion (InDels) separately. Moreover, many important aspects of InDel detection are overlooked while comparing the performance of tools, particularly its base pair length. RESULTS: We assessed the performance of variant calling pipelines using the combinations of four variant callers and five aligners on human NA12878 and simulated exome data. We used high confidence variant calls from Genome in a Bottle (GiaB) consortium for validation, and GRCh37 and GRCh38 as the human reference genome. Based on the performance metrics, both BWA and Novoalign aligners performed better with DeepVariant and SAMtools callers for detecting SNVs, and with DeepVariant and GATK for InDels. Furthermore, we obtained similar results on human NA24385 and NA24631 exome data from GiaB. CONCLUSION: In this study, DeepVariant with BWA and Novoalign performed best for detecting accurate SNVs and InDels. The accuracy of variant calling was improved by merging the top performing pipelines. The results of our study provide useful recommendations for analysis of WES data in clinical genomics. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2928-9) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6580603 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-65806032019-06-24 Performance assessment of variant calling pipelines using human whole exome sequencing and simulated data Kumaran, Manojkumar Subramanian, Umadevi Devarajan, Bharanidharan BMC Bioinformatics Research Article BACKGROUND: Whole exome sequencing (WES) is a cost-effective method that identifies clinical variants but it demands accurate variant caller tools. Currently available tools have variable accuracy in predicting specific clinical variants. But it may be possible to find the best combination of aligner-variant caller tools for detecting accurate single nucleotide variants (SNVs) and small insertion and deletion (InDels) separately. Moreover, many important aspects of InDel detection are overlooked while comparing the performance of tools, particularly its base pair length. RESULTS: We assessed the performance of variant calling pipelines using the combinations of four variant callers and five aligners on human NA12878 and simulated exome data. We used high confidence variant calls from Genome in a Bottle (GiaB) consortium for validation, and GRCh37 and GRCh38 as the human reference genome. Based on the performance metrics, both BWA and Novoalign aligners performed better with DeepVariant and SAMtools callers for detecting SNVs, and with DeepVariant and GATK for InDels. Furthermore, we obtained similar results on human NA24385 and NA24631 exome data from GiaB. CONCLUSION: In this study, DeepVariant with BWA and Novoalign performed best for detecting accurate SNVs and InDels. The accuracy of variant calling was improved by merging the top performing pipelines. The results of our study provide useful recommendations for analysis of WES data in clinical genomics. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2928-9) contains supplementary material, which is available to authorized users. 
BioMed Central 2019-06-17 /pmc/articles/PMC6580603/ /pubmed/31208315 http://dx.doi.org/10.1186/s12859-019-2928-9 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Kumaran, Manojkumar Subramanian, Umadevi Devarajan, Bharanidharan Performance assessment of variant calling pipelines using human whole exome sequencing and simulated data |
title | Performance assessment of variant calling pipelines using human whole exome sequencing and simulated data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6580603/ https://www.ncbi.nlm.nih.gov/pubmed/31208315 http://dx.doi.org/10.1186/s12859-019-2928-9 |