
Leveraging histone modifications to improve genome annotations

Accurate genome annotations are essential to modern biology; however, they remain challenging to produce. Variation in gene structure and expression across species, as well as within an organism, makes correctly annotating genes arduous, an issue exacerbated by pitfalls in current in silico methods...


Bibliographic Details
Main Authors: Mendieta, John Pablo, Marand, Alexandre P, Ricci, William A, Zhang, Xuan, Schmitz, Robert J
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Genome Report
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8473982/
https://www.ncbi.nlm.nih.gov/pubmed/34568920
http://dx.doi.org/10.1093/g3journal/jkab263
author Mendieta, John Pablo
Marand, Alexandre P
Ricci, William A
Zhang, Xuan
Schmitz, Robert J
collection PubMed
description Accurate genome annotations are essential to modern biology; however, they remain challenging to produce. Variation in gene structure and expression across species, as well as within an organism, makes correctly annotating genes arduous, an issue exacerbated by pitfalls in current in silico methods. These issues necessitate complementary approaches to add confidence and rectify potential misannotations. Integration of epigenomic data into genome annotation is one such approach. In this study, we utilized sets of histone modification data, which are precisely distributed at either gene bodies or promoters, to evaluate the annotation of the Zea mays genome. We leveraged these data genome-wide, allowing for identification of annotations discordant with empirical data. In total, 13,159 annotation discrepancies were found in Z. mays upon integrating data across three different tissues, which were corroborated using RNA-based approaches. Upon correction, genes were extended by an average of 2128 base pairs, and we identified 2529 novel genes. Application of this method to five additional plant genomes identified a series of misannotations as well as novel genes, including 13,836 in Asparagus officinalis, 2724 in Setaria viridis, 2446 in Sorghum bicolor, 8631 in Glycine max, and 2585 in Phaseolus vulgaris. This study demonstrates that histone modification data can be leveraged to rapidly improve current genome annotations across diverse plant lineages.
format Online
Article
Text
id pubmed-8473982
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8473982 2021-09-27 Leveraging histone modifications to improve genome annotations Mendieta, John Pablo Marand, Alexandre P Ricci, William A Zhang, Xuan Schmitz, Robert J G3 (Bethesda) Genome Report Oxford University Press 2021-07-27 /pmc/articles/PMC8473982/ /pubmed/34568920 http://dx.doi.org/10.1093/g3journal/jkab263 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Leveraging histone modifications to improve genome annotations
topic Genome Report