Advances in gene ontology utilization improve statistical power of annotation enrichment

Gene-annotation enrichment is a common method for utilizing ontology-based annotations in gene and gene-product centric knowledgebases. Effective utilization of these annotations requires inferring semantic linkages by tracing paths through edges in the ontological graph, referred to as relations. H...

Bibliographic Details
Main authors: Hinderer, Eugene W., Flight, Robert M., Dubey, Rashmi, MacLeod, James N., Moseley, Hunter N. B.
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6695228/
https://www.ncbi.nlm.nih.gov/pubmed/31415589
http://dx.doi.org/10.1371/journal.pone.0220728
_version_ 1783443999493718016
author Hinderer, Eugene W.
Flight, Robert M.
Dubey, Rashmi
MacLeod, James N.
Moseley, Hunter N. B.
collection PubMed
description Gene-annotation enrichment is a common method for utilizing ontology-based annotations in gene and gene-product centric knowledgebases. Effective utilization of these annotations requires inferring semantic linkages by tracing paths through edges in the ontological graph, referred to as relations. However, some relations are semantically problematic with respect to scope, necessitating their omission or modification lest erroneous term mappings occur. To address these issues, we created the Gene Ontology Categorization Suite, or GOcats, a novel tool that organizes the Gene Ontology into subgraphs representing user-defined concepts, while ensuring that all appropriate relations are congruent with respect to scoping semantics. Here, we demonstrate the improvements in annotation enrichment by re-interpreting edges that would otherwise be omitted by traditional ancestor path-tracing methods. Specifically, we show that GOcats’ unique handling of relations improves enrichment over conventional methods in the analysis of two different gene-expression datasets: a breast cancer microarray dataset and several horse cartilage development RNAseq datasets. With the breast cancer microarray dataset, we observed significant improvement (one-sided binomial test p-value = 1.86E-25) in 182 of 217 significantly enriched GO terms identified from the conventional path traversal method when GOcats’ path traversal was used. We also found new significantly enriched terms using GOcats, whose biological relevancy has been experimentally demonstrated elsewhere. Likewise, on the horse RNAseq datasets, we observed a significant improvement in GO term enrichment when using GOcats’ path traversal: one-sided binomial test p-values range from 1.32E-03 to 2.58E-44.
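The one-sided binomial test quoted in the description can be illustrated with a minimal sketch. It assumes the null hypothesis is that each of the 217 terms is equally likely to improve or not (success probability 0.5); the abstract does not state the null, so that value, and the function name `binomial_upper_tail`, are illustrative assumptions rather than the authors' procedure.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_null).

    This is the p-value of a one-sided (greater) exact binomial test.
    p_null = 0.5 is an assumption; the abstract does not state the null.
    """
    return sum(
        comb(trials, k) * p_null**k * (1.0 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # Counts taken from the description: 182 of 217 terms improved.
    p_value = binomial_upper_tail(182, 217)
    print(f"one-sided binomial p-value: {p_value:.3g}")
```

The exact tail sum is used instead of a normal approximation because the approximation is unreliable this far into the tail.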
format Online
Article
Text
id pubmed-6695228
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-66952282019-08-30 Advances in gene ontology utilization improve statistical power of annotation enrichment Hinderer, Eugene W. Flight, Robert M. Dubey, Rashmi MacLeod, James N. Moseley, Hunter N. B. PLoS One Research Article Gene-annotation enrichment is a common method for utilizing ontology-based annotations in gene and gene-product centric knowledgebases. Effective utilization of these annotations requires inferring semantic linkages by tracing paths through edges in the ontological graph, referred to as relations. However, some relations are semantically problematic with respect to scope, necessitating their omission or modification lest erroneous term mappings occur. To address these issues, we created the Gene Ontology Categorization Suite, or GOcats—a novel tool that organizes the Gene Ontology into subgraphs representing user-defined concepts, while ensuring that all appropriate relations are congruent with respect to scoping semantics. Here, we demonstrate the improvements in annotation enrichment by re-interpreting edges that would otherwise be omitted by traditional ancestor path-tracing methods. Specifically, we show that GOcats’ unique handling of relations improves enrichment over conventional methods in the analysis of two different gene-expression datasets: a breast cancer microarray dataset and several horse cartilage development RNAseq datasets. With the breast cancer microarray dataset, we observed significant improvement (one-sided binomial test p-value = 1.86E-25) in 182 of 217 significantly enriched GO terms identified from the conventional path traversal method when GOcats’ path traversal was used. We also found new significantly enriched terms using GOcats, whose biological relevancy has been experimentally demonstrated elsewhere. 
Likewise, on the horse RNAseq datasets, we observed a significant improvement in GO term enrichment when using GOcats’ path traversal: one-sided binomial test p-values range from 1.32E-03 to 2.58E-44. Public Library of Science 2019-08-15 /pmc/articles/PMC6695228/ /pubmed/31415589 http://dx.doi.org/10.1371/journal.pone.0220728 Text en © 2019 Hinderer et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Advances in gene ontology utilization improve statistical power of annotation enrichment
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6695228/
https://www.ncbi.nlm.nih.gov/pubmed/31415589
http://dx.doi.org/10.1371/journal.pone.0220728