
The importance of genotype identity, genetic heterogeneity, and bioinformatic handling for properly assessing genomic variation in transgenic plants

BACKGROUND: The advent of –omics technologies has enabled the resolution of fine molecular differences among individuals within a species. DNA sequence variations, such as single nucleotide polymorphisms or small deletions, can be tabulated for many kinds of genotype comparisons. However, experiment...


Bibliographic Details
Main authors: Michno, Jean-Michel; Stupar, Robert M.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5984819/
https://www.ncbi.nlm.nih.gov/pubmed/29859067
http://dx.doi.org/10.1186/s12896-018-0447-9
_version_ 1783328672258719744
author Michno, Jean-Michel
Stupar, Robert M.
collection PubMed
description BACKGROUND: The advent of –omics technologies has enabled the resolution of fine molecular differences among individuals within a species. DNA sequence variations, such as single nucleotide polymorphisms or small deletions, can be tabulated for many kinds of genotype comparisons. However, experimental designs and analytical approaches are replete with ways to overestimate the level of variation present within a given sample. Analytical pipelines that do not apply proper thresholds nor assess reproducibility among samples are susceptible to calling false-positive variants. Furthermore, issues with sample genotype identity or failing to account for heterogeneity in reference genotypes may lead to misinterpretations of standing variants as polymorphisms derived de novo. RESULTS: A recent publication that featured the analysis of RNA-sequencing data in three transgenic soybean event series appeared to overestimate the number of sequence variants identified in plants that were exposed to a tissue culture based transformation process. We reanalyzed these data with a stringent set of criteria and demonstrate three different factors that lead to variant overestimation, including issues related to the genetic identity of the background genotype, unaccounted genetic heterogeneity in the reference genome, and insufficient bioinformatics filtering. CONCLUSIONS: This study serves as a cautionary tale to users of genomic and transcriptomic data that wish to assess the molecular variation attributable to tissue culture and transformation processes. Moreover, accounting for the factors that lead to sequence variant overestimation is equally applicable to samples derived from other germplasm sources, including chemical or irradiation mutagenesis and genome engineering (e.g., CRISPR) processes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12896-018-0447-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5984819
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5984819 2018-06-07 BMC Biotechnol Research Article BioMed Central 2018-06-01 /pmc/articles/PMC5984819/ /pubmed/29859067 http://dx.doi.org/10.1186/s12896-018-0447-9 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title The importance of genotype identity, genetic heterogeneity, and bioinformatic handling for properly assessing genomic variation in transgenic plants
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5984819/
https://www.ncbi.nlm.nih.gov/pubmed/29859067
http://dx.doi.org/10.1186/s12896-018-0447-9