Multiple Variant Calling Pipelines in Wheat Whole Exome Sequencing

Bibliographic Details
Main Authors: Cagirici, H. Busra, Akpinar, Bala Ani, Sen, Taner Z., Budak, Hikmet
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8509018/
https://www.ncbi.nlm.nih.gov/pubmed/34638743
http://dx.doi.org/10.3390/ijms221910400
collection PubMed
description The challenging hexaploid wheat (Triticum aestivum) genome is becoming ever more accessible thanks to the continued development of multiple reference genomes, which aids the effort to better understand variation in important traits. Although the process of variant calling is relatively straightforward, selecting the best combination of computational tools for the read alignment and variant calling stages of the analysis, and efficiently filtering false variant calls, are not always easy tasks. Previous studies have analyzed the impact of methods on quality metrics in diploid organisms. Given that variant identification in wheat largely relies on accurate mining of exome data, there is a critical need to better understand how different methods affect the analysis of whole exome sequencing (WES) data in polyploid species. This study addresses that need by performing whole exome sequencing of 48 wheat cultivars and assessing the performance of various variant calling pipelines at their suggested settings. The results show that all the pipelines require filtering to eliminate false-positive calls. The high consensus among the reference SNPs called by the best-performing pipelines suggests that filtering provides accurate and reproducible results. This study also provides detailed comparisons for high sensitivity and precision at individual and population levels for the raw and filtered SNP calls.
id pubmed-8509018
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8509018 2021-10-13 Multiple Variant Calling Pipelines in Wheat Whole Exome Sequencing. Cagirici, H. Busra; Akpinar, Bala Ani; Sen, Taner Z.; Budak, Hikmet. Int J Mol Sci. Article. MDPI 2021-09-27. Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).