
RedundancyMiner: De-replication of redundant GO categories in microarray and proteomics analysis

BACKGROUND: The Gene Ontology (GO) Consortium organizes genes into hierarchical categories based on biological process, molecular function and subcellular localization. Tools such as GoMiner can leverage GO to perform ontological analysis of microarray and proteomics studies, typically generating a...


Bibliographic Details
Main Authors: Zeeberg, Barry R, Liu, Hongfang, Kahn, Ari B, Ehler, Martin, Rajapakse, Vinodh N, Bonner, Robert F, Brown, Jacob D, Brooks, Brian P, Larionov, Vladimir L, Reinhold, William, Weinstein, John N, Pommier, Yves G
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3223614/
https://www.ncbi.nlm.nih.gov/pubmed/21310028
http://dx.doi.org/10.1186/1471-2105-12-52
author Zeeberg, Barry R
Liu, Hongfang
Kahn, Ari B
Ehler, Martin
Rajapakse, Vinodh N
Bonner, Robert F
Brown, Jacob D
Brooks, Brian P
Larionov, Vladimir L
Reinhold, William
Weinstein, John N
Pommier, Yves G
collection PubMed
description BACKGROUND: The Gene Ontology (GO) Consortium organizes genes into hierarchical categories based on biological process, molecular function and subcellular localization. Tools such as GoMiner can leverage GO to perform ontological analysis of microarray and proteomics studies, typically generating a list of significant functional categories. Two or more of the categories are often redundant, in the sense that identical or nearly identical sets of genes map to the categories. The redundancy might typically inflate the report of significant categories about three-fold, create an illusion of an overly long list of significant categories, and obscure the relevant biological interpretation. RESULTS: We now introduce a new resource, RedundancyMiner, that de-replicates the redundant and nearly redundant GO categories that had been determined by first running GoMiner. The main algorithm of RedundancyMiner, MultiClust, performs a novel form of cluster analysis in which a GO category might belong to several category clusters. Each category cluster follows a "complete linkage" paradigm. The metric is a similarity measure that captures the overlap in gene mapping between pairs of categories. CONCLUSIONS: RedundancyMiner effectively eliminated redundancies from a set of GO categories. For illustration, we have applied it to the clarification of the results arising from two current studies: (1) assessment of the gene expression profiles obtained by laser capture microdissection (LCM) of serial cryosections of the retina at the site of final optic fissure closure in mouse embryos at specific embryonic stages, and (2) analysis of a conceptual data set obtained by examining a list of genes deemed to be "kinetochore" genes.
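The clustering idea in the abstract (a pairwise overlap measure between categories, clusters in which every pair of members is similar, and categories allowed to sit in several clusters) can be illustrated with a small sketch. This is not the authors' MultiClust implementation. The Jaccard index, the 0.75 threshold, the function names, and the use of maximal-clique enumeration (Bron-Kerbosch) are all assumptions chosen for illustration, as are the category and gene names in the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two gene sets, |A & B| / |A | B| (an assumed similarity metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_clusters(categories, threshold=0.75):
    """Group categories so that every pair inside a group is similar.

    categories: dict mapping a category name to its set of gene symbols.
    Returns the maximal groups; one category may appear in several groups.
    """
    names = sorted(categories)
    # Graph of category pairs whose similarity reaches the (assumed) threshold.
    neighbors = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if jaccard(categories[a], categories[b]) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters = []

    def expand(clique, candidates, excluded):
        # Bron-Kerbosch enumeration of maximal cliques, without pivoting.
        if not candidates and not excluded:
            clusters.append(sorted(clique))
            return
        for node in sorted(candidates):
            expand(clique | {node},
                   candidates & neighbors[node],
                   excluded & neighbors[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(names), set())
    return sorted(clusters)


if __name__ == "__main__":
    # Hypothetical categories and genes, for illustration only.
    example = {
        "cat_A": {"g1", "g2", "g3", "g4"},
        "cat_B": {"g1", "g2", "g3", "g4", "g5"},
        "cat_C": {"g2", "g3", "g4", "g5"},
        "cat_D": {"g9"},
    }
    print(overlapping_clusters(example))
    # [['cat_A', 'cat_B'], ['cat_B', 'cat_C'], ['cat_D']]
```

In this toy input, cat_B is similar to both cat_A and cat_C, but cat_A and cat_C are not similar enough to each other, so cat_B ends up in two clusters. Requiring every pair in a group to pass the threshold is what gives each group its complete-linkage character.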
format Online
Article
Text
id pubmed-3223614
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3223614 2011-11-26 RedundancyMiner: De-replication of redundant GO categories in microarray and proteomics analysis. BMC Bioinformatics. Software. BioMed Central 2011-02-10 /pmc/articles/PMC3223614/ /pubmed/21310028 http://dx.doi.org/10.1186/1471-2105-12-52 Text en Copyright © 2011 Zeeberg et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title RedundancyMiner: De-replication of redundant GO categories in microarray and proteomics analysis
topic Software