Correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (Khk)

Next generation sequencing protocols such as RNA-seq have made the genome-wide characterization of the transcriptome a crucial part of many research projects in biology. Analyses of the resulting data provide key information on gene expression and in certain cases on exon or isoform usage. The emerg...

Bibliographic Details
Main authors: Chabbert, Christophe D., Eberhart, Tanja, Guccini, Ilaria, Krek, Wilhelm, Kovacs, Werner J.
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6464065/
https://www.ncbi.nlm.nih.gov/pubmed/31001414
http://dx.doi.org/10.12688/f1000research.17082.2
_version_ 1783410828059344896
author Chabbert, Christophe D.
Eberhart, Tanja
Guccini, Ilaria
Krek, Wilhelm
Kovacs, Werner J.
author_facet Chabbert, Christophe D.
Eberhart, Tanja
Guccini, Ilaria
Krek, Wilhelm
Kovacs, Werner J.
author_sort Chabbert, Christophe D.
collection PubMed
description Next-generation sequencing protocols such as RNA-seq have made the genome-wide characterization of the transcriptome a crucial part of many research projects in biology. Analyses of the resulting data provide key information on gene expression and, in certain cases, on exon or isoform usage. The emergence of transcript quantification software such as Salmon has enabled researchers to efficiently estimate isoform and gene expression across the genome while tremendously reducing the necessary computational power. Although overall gene expression estimates were shown to be accurate, isoform expression quantification appears to be a more challenging task. Low expression levels and uneven or insufficient coverage were reported as potential explanations for inconsistent estimates. Here, through the example of the ketohexokinase (Khk) gene in mouse, we demonstrate that the use of an incorrect gene annotation can also result in erroneous isoform quantification results. Manual correction of the input Khk gene model provided a much more accurate estimation of relative Khk isoform expression when compared to quantitative PCR (qPCR) measurements. In particular, removal of an unexpressed retained intron and a proper adjustment of the 5’ and 3’ untranslated regions both had a strong impact on the correction of erroneous estimates. Finally, we observed a better concordance in isoform quantification between datasets and sequencing strategies when relying on the newly generated Khk annotations. These results highlight the importance of accurate gene models and annotations for correct isoform quantification and reassert the need for orthogonal methods of estimation of isoform expression to confirm important findings.
format Online
Article
Text
id pubmed-6464065
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-6464065 2019-04-17 Correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (Khk) Chabbert, Christophe D. Eberhart, Tanja Guccini, Ilaria Krek, Wilhelm Kovacs, Werner J. F1000Res Research Article Next-generation sequencing protocols such as RNA-seq have made the genome-wide characterization of the transcriptome a crucial part of many research projects in biology. Analyses of the resulting data provide key information on gene expression and, in certain cases, on exon or isoform usage. The emergence of transcript quantification software such as Salmon has enabled researchers to efficiently estimate isoform and gene expression across the genome while tremendously reducing the necessary computational power. Although overall gene expression estimates were shown to be accurate, isoform expression quantification appears to be a more challenging task. Low expression levels and uneven or insufficient coverage were reported as potential explanations for inconsistent estimates. Here, through the example of the ketohexokinase (Khk) gene in mouse, we demonstrate that the use of an incorrect gene annotation can also result in erroneous isoform quantification results. Manual correction of the input Khk gene model provided a much more accurate estimation of relative Khk isoform expression when compared to quantitative PCR (qPCR) measurements. In particular, removal of an unexpressed retained intron and a proper adjustment of the 5’ and 3’ untranslated regions both had a strong impact on the correction of erroneous estimates. Finally, we observed a better concordance in isoform quantification between datasets and sequencing strategies when relying on the newly generated Khk annotations. These results highlight the importance of accurate gene models and annotations for correct isoform quantification and reassert the need for orthogonal methods of estimation of isoform expression to confirm important findings.
F1000 Research Limited 2019-04-03 /pmc/articles/PMC6464065/ /pubmed/31001414 http://dx.doi.org/10.12688/f1000research.17082.2 Text en Copyright: © 2019 Chabbert CD et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Chabbert, Christophe D.
Eberhart, Tanja
Guccini, Ilaria
Krek, Wilhelm
Kovacs, Werner J.
Correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (Khk)
title Correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (Khk)
title_full Correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (Khk)
title_fullStr Correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (Khk)
title_full_unstemmed Correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (Khk)
title_short Correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (Khk)
title_sort correction of gene model annotations improves isoform abundance estimates: the example of ketohexokinase (khk)
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6464065/
https://www.ncbi.nlm.nih.gov/pubmed/31001414
http://dx.doi.org/10.12688/f1000research.17082.2
work_keys_str_mv AT chabbertchristophed correctionofgenemodelannotationsimprovesisoformabundanceestimatestheexampleofketohexokinasekhk
AT eberharttanja correctionofgenemodelannotationsimprovesisoformabundanceestimatestheexampleofketohexokinasekhk
AT gucciniilaria correctionofgenemodelannotationsimprovesisoformabundanceestimatestheexampleofketohexokinasekhk
AT krekwilhelm correctionofgenemodelannotationsimprovesisoformabundanceestimatestheexampleofketohexokinasekhk
AT kovacswernerj correctionofgenemodelannotationsimprovesisoformabundanceestimatestheexampleofketohexokinasekhk