
The GATK joint genotyping workflow is appropriate for calling variants in RNA-seq experiments

The Genome Analysis Toolkit (GATK) is a popular set of programs for discovering and genotyping variants from next-generation sequencing data. The current GATK recommendation for RNA sequencing (RNA-seq) is to perform variant calling from individual samples, with the drawback that only variable posit...


Bibliographic Details
Main Authors: Brouard, Jean-Simon; Schenkel, Flavio; Marete, Andrew; Bissonnette, Nathalie
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6587293/
https://www.ncbi.nlm.nih.gov/pubmed/31249686
http://dx.doi.org/10.1186/s40104-019-0359-0
author Brouard, Jean-Simon
Schenkel, Flavio
Marete, Andrew
Bissonnette, Nathalie
collection PubMed
description The Genome Analysis Toolkit (GATK) is a popular set of programs for discovering and genotyping variants from next-generation sequencing data. The current GATK recommendation for RNA sequencing (RNA-seq) is to perform variant calling from individual samples, with the drawback that only variable positions are reported. Versions 3.0 and above of GATK offer the possibility of calling DNA variants on cohorts of samples using the HaplotypeCaller algorithm in Genomic Variant Call Format (GVCF) mode. Using this approach, variants are called individually on each sample, generating one GVCF file per sample that lists genotype likelihoods and their genome annotations. In a second step, variants are called from the GVCF files through a joint genotyping analysis. This strategy is more flexible and reduces computational challenges in comparison to the traditional joint discovery workflow. Using a GVCF workflow for mining SNPs in RNA-seq data provides substantial advantages, including reporting homozygous genotypes for the reference allele as well as missing data. Taking advantage of RNA-seq data derived from primary macrophages isolated from 50 cows, the GATK joint genotyping method for calling variants on RNA-seq data was validated by comparing this approach to a so-called “per-sample” method. In addition, pairwise comparisons of the two methods were performed to evaluate their respective sensitivity, precision and accuracy using DNA genotypes from a companion study including the same 50 cows genotyped using either genotyping-by-sequencing or the Bovine SNP50 Beadchip (imputed to the Bovine high density). Results indicate that both approaches are very close in their capacity to detect reference variants and that the joint genotyping method is more sensitive than the per-sample method. Given that the joint genotyping method is more flexible and technically easier, we recommend this approach for variant calling in RNA-seq experiments.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40104-019-0359-0) contains supplementary material, which is available to authorized users.
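The two-step GVCF workflow summarized in the description (per-sample calling with HaplotypeCaller in GVCF mode, then joint genotyping across the cohort) can be illustrated with a minimal sketch. This is not the authors' pipeline: the record refers to GATK 3.0 and above, whereas the sketch assumes GATK4-style command-line syntax, and the sample names, file names, and helper functions are hypothetical. It only builds the command lines and does not run GATK. RNA-seq-specific preprocessing and filtering steps are not described in this record and are left out.

```python
"""Sketch of a two-step GVCF workflow: per-sample calling, then joint genotyping.

Assumptions: GATK4-style syntax; all file names and helpers are hypothetical.
"""
import shlex


def haplotypecaller_gvcf_cmd(reference, bam, out_gvcf):
    """Step 1: call one sample with HaplotypeCaller in GVCF mode (-ERC GVCF)."""
    return ["gatk", "HaplotypeCaller",
            "-R", reference, "-I", bam, "-O", out_gvcf,
            "-ERC", "GVCF"]


def combine_gvcfs_cmd(reference, gvcfs, out_gvcf):
    """Merge the per-sample GVCF files into a single cohort GVCF."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd + ["-O", out_gvcf]


def genotype_gvcfs_cmd(reference, combined_gvcf, out_vcf):
    """Step 2: joint genotyping of the cohort from the combined GVCF."""
    return ["gatk", "GenotypeGVCFs",
            "-R", reference, "-V", combined_gvcf, "-O", out_vcf]


def joint_genotyping_plan(reference, sample_bams, cohort="cohort"):
    """Return the ordered list of commands for a cohort.

    sample_bams maps a sample name to its aligned BAM file (hypothetical paths).
    """
    commands = []
    gvcfs = []
    for sample, bam in sample_bams.items():
        gvcf = f"{sample}.g.vcf.gz"
        gvcfs.append(gvcf)
        commands.append(haplotypecaller_gvcf_cmd(reference, bam, gvcf))
    combined = f"{cohort}.g.vcf.gz"
    commands.append(combine_gvcfs_cmd(reference, gvcfs, combined))
    commands.append(genotype_gvcfs_cmd(reference, combined, f"{cohort}.vcf.gz"))
    return commands


if __name__ == "__main__":
    bams = {"cow01": "cow01.bam", "cow02": "cow02.bam"}
    for cmd in joint_genotyping_plan("reference.fa", bams):
        print(shlex.join(cmd))
```

Because each sample gets its own GVCF file, a sample added later only needs its own HaplotypeCaller run before the combine and joint genotyping steps are repeated, which is one way to read the flexibility the description attributes to this strategy.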
format Online
Article
Text
id pubmed-6587293
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6587293 2019-06-27 The GATK joint genotyping workflow is appropriate for calling variants in RNA-seq experiments. Brouard, Jean-Simon; Schenkel, Flavio; Marete, Andrew; Bissonnette, Nathalie. J Anim Sci Biotechnol. Short Report. BioMed Central 2019-06-21 /pmc/articles/PMC6587293/ /pubmed/31249686 http://dx.doi.org/10.1186/s40104-019-0359-0 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title The GATK joint genotyping workflow is appropriate for calling variants in RNA-seq experiments
topic Short Report