Leveraging histone modifications to improve genome annotations
Accurate genome annotations are essential to modern biology; however, they remain challenging to produce. Variation in gene structure and expression across species, as well as within an organism, makes correctly annotating genes arduous, an issue exacerbated by pitfalls in current in silico methods....
Main authors: | Mendieta, John Pablo; Marand, Alexandre P; Ricci, William A; Zhang, Xuan; Schmitz, Robert J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Genome Report |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8473982/ https://www.ncbi.nlm.nih.gov/pubmed/34568920 http://dx.doi.org/10.1093/g3journal/jkab263 |
_version_ | 1784575119256977408 |
---|---|
author | Mendieta, John Pablo Marand, Alexandre P Ricci, William A Zhang, Xuan Schmitz, Robert J |
author_sort | Mendieta, John Pablo |
collection | PubMed |
description | Accurate genome annotations are essential to modern biology; however, they remain challenging to produce. Variation in gene structure and expression across species, as well as within an organism, makes correctly annotating genes arduous, an issue exacerbated by pitfalls in current in silico methods. These issues necessitate complementary approaches to add confidence and rectify potential misannotations. Integration of epigenomic data into genome annotation is one such approach. In this study, we utilized sets of histone modification data, which are precisely distributed at either gene bodies or promoters, to evaluate the annotation of the Zea mays genome. We leveraged these data genome-wide, allowing for identification of annotations discordant with empirical data. In total, 13,159 annotation discrepancies were found in Z. mays upon integrating data across three different tissues, which were corroborated using RNA-based approaches. Upon correction, genes were extended by an average of 2128 base pairs, and we identified 2529 novel genes. Application of this method to five additional plant genomes identified a series of misannotations as well as novel genes, including 13,836 in Asparagus officinalis, 2724 in Setaria viridis, 2446 in Sorghum bicolor, 8631 in Glycine max, and 2585 in Phaseolus vulgaris. This study demonstrates that histone modification data can be leveraged to rapidly improve current genome annotations across diverse plant lineages. |
format | Online Article Text |
id | pubmed-8473982 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8473982 2021-09-27 Leveraging histone modifications to improve genome annotations Mendieta, John Pablo Marand, Alexandre P Ricci, William A Zhang, Xuan Schmitz, Robert J G3 (Bethesda) Genome Report Oxford University Press 2021-07-27 /pmc/articles/PMC8473982/ /pubmed/34568920 http://dx.doi.org/10.1093/g3journal/jkab263 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Leveraging histone modifications to improve genome annotations |
topic | Genome Report |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8473982/ https://www.ncbi.nlm.nih.gov/pubmed/34568920 http://dx.doi.org/10.1093/g3journal/jkab263 |