Categorizer: a tool to categorize genes into user-defined biological groups based on semantic similarity
BACKGROUND: Commonalities between large sets of genes obtained from high-throughput experiments are often identified by searching for enrichments of genes with the same Gene Ontology (GO) annotations. The GO analysis tools used for these enrichment analyses assume that GO terms are independent and t...
Main Authors: | Na, Dokyun; Son, Hyungbin; Gsponer, Jörg |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298957/ https://www.ncbi.nlm.nih.gov/pubmed/25495442 http://dx.doi.org/10.1186/1471-2164-15-1091 |
_version_ | 1782353331854770176 |
---|---|
author | Na, Dokyun; Son, Hyungbin; Gsponer, Jörg |
author_sort | Na, Dokyun |
collection | PubMed |
description | BACKGROUND: Commonalities between large sets of genes obtained from high-throughput experiments are often identified by searching for enrichments of genes with the same Gene Ontology (GO) annotations. The GO analysis tools used for these enrichment analyses assume that GO terms are independent and that the semantic distances between all parent–child terms are identical, which is not true in a biological sense. In addition, these tools output lists of GO terms that are often redundant or too specific, which makes them difficult to interpret in the context of the biological question investigated by the user. Therefore, there is a demand for a robust and reliable method for gene categorization and enrichment analysis. RESULTS: We have developed Categorizer, a tool that classifies genes into user-defined groups (categories) and calculates p-values for the enrichment of the categories. Categorizer identifies the biologically best-fit category for each gene by taking advantage of a specialized semantic similarity measure for GO terms. We demonstrate that Categorizer provides improved categorization and enrichment results for genetic modifiers of Huntington’s disease compared to a classical GO Slim-based approach or categorizations using other semantic similarity measures. CONCLUSION: Categorizer enables more accurate categorization of genes than currently available methods. This new tool will help experimental and computational biologists analyze genomic and proteomic data according to their specific needs in a more reliable manner. |
format | Online Article Text |
id | pubmed-4298957 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4298957 2015-01-21 Categorizer: a tool to categorize genes into user-defined biological groups based on semantic similarity Na, Dokyun; Son, Hyungbin; Gsponer, Jörg BMC Genomics Software BioMed Central 2014-12-11 /pmc/articles/PMC4298957/ /pubmed/25495442 http://dx.doi.org/10.1186/1471-2164-15-1091 Text en © Na et al.; licensee BioMed Central. 2014 This article is published under license to BioMed Central Ltd. 
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Categorizer: a tool to categorize genes into user-defined biological groups based on semantic similarity |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298957/ https://www.ncbi.nlm.nih.gov/pubmed/25495442 http://dx.doi.org/10.1186/1471-2164-15-1091 |