
Cluster analysis of protein array results via similarity of Gene Ontology annotation

BACKGROUND: With the advent of high-throughput proteomic experiments such as arrays of purified proteins comes the need to analyse sets of proteins as an ensemble, as opposed to the traditional one-protein-at-a-time approach. Although there are several publicly available tools that facilitate the an...

Bibliographic Details
Main authors: Wolting, Cheryl, McGlade, C Jane, Tritchler, David
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1539024/
https://www.ncbi.nlm.nih.gov/pubmed/16836750
http://dx.doi.org/10.1186/1471-2105-7-338
author Wolting, Cheryl
McGlade, C Jane
Tritchler, David
collection PubMed
description BACKGROUND: With the advent of high-throughput proteomic experiments, such as arrays of purified proteins, comes the need to analyse sets of proteins as an ensemble rather than one protein at a time. Although several publicly available tools facilitate the analysis of protein sets, they either do not display integrated results in an easily interpreted image or do not allow the user to specify the proteins to be analysed. RESULTS: We developed a novel computational approach to analyse the annotation of sets of molecules. As proof of principle, we analysed two sets of proteins identified in published protein array screens. The distance between any two proteins was measured as the graph similarity between their Gene Ontology (GO) annotations. These distances were then clustered to highlight subsets of proteins sharing related GO annotation. In the first set, proteins found to bind small molecule inhibitors of rapamycin, we identified three subsets of four or five proteins each that may help to elucidate how rapamycin affects cell growth, whereas the original authors chose only one novel protein from the array results for further study. In the second set, phosphoinositide-binding proteins, we identified subsets of proteins associated with different intracellular structures that were not highlighted by the analysis in the original publication. CONCLUSION: By determining the distances between annotations, our methodology reveals trends and enrichment of proteins of particular functions within high-throughput datasets with greater sensitivity than perusal of end-point annotations. In an era of increasingly complex datasets, such tools will help in the formulation of new, testable hypotheses from high-throughput experimental data.
format Text
id pubmed-1539024
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1539024 2006-08-14 Cluster analysis of protein array results via similarity of Gene Ontology annotation Wolting, Cheryl McGlade, C Jane Tritchler, David BMC Bioinformatics Methodology Article BioMed Central 2006-07-12 /pmc/articles/PMC1539024/ /pubmed/16836750 http://dx.doi.org/10.1186/1471-2105-7-338 Text en Copyright © 2006 Wolting et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Cluster analysis of protein array results via similarity of Gene Ontology annotation
topic Methodology Article
work_keys_str_mv AT woltingcheryl clusteranalysisofproteinarrayresultsviasimilarityofgeneontologyannotation
AT mcgladecjane clusteranalysisofproteinarrayresultsviasimilarityofgeneontologyannotation
AT tritchlerdavid clusteranalysisofproteinarrayresultsviasimilarityofgeneontologyannotation
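
The abstract describes a two-step procedure: compute a distance between every pair of proteins from the similarity of their GO annotation graphs, then cluster those distances. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a hypothetical toy ontology and hypothetical protein names, uses one minus the Jaccard overlap of each protein's induced term set (its annotated terms plus all ancestors) as a stand-in for the graph similarity measure, and uses average-linkage agglomerative clustering with an arbitrary cut-off; the record does not specify which measure or clustering method the paper used.

```python
from itertools import combinations

# Hypothetical toy ontology: child term -> parent terms (a DAG, like GO).
PARENTS = {
    "root": [],
    "binding": ["root"],
    "lipid_binding": ["binding"],
    "pip_binding": ["lipid_binding"],
    "catalysis": ["root"],
    "kinase": ["catalysis"],
}

# Hypothetical protein annotations (names are illustrative only).
ANNOTATIONS = {
    "P1": {"pip_binding"},
    "P2": {"lipid_binding"},
    "P3": {"kinase"},
    "P4": {"catalysis"},
}


def induced_terms(terms):
    """Return the given terms plus all of their ancestors in the DAG."""
    seen, stack = set(), list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


def distance(p, q):
    """Assumed measure: 1 - Jaccard overlap of the induced term sets."""
    a = induced_terms(ANNOTATIONS[p])
    b = induced_terms(ANNOTATIONS[q])
    return 1.0 - len(a & b) / len(a | b)


def linkage(c1, c2):
    """Average pairwise distance between two clusters of proteins."""
    total = sum(distance(p, q) for p in c1 for q in c2)
    return total / (len(c1) * len(c2))


def cluster(proteins, threshold):
    """Merge the closest pair of clusters until none is within threshold."""
    clusters = [frozenset([p]) for p in proteins]
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        if linkage(c1, c2) > threshold:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    print(cluster(sorted(ANNOTATIONS), threshold=0.5))
```

In this toy data the two binding proteins share most of their ancestor terms, as do the two catalytic ones, so they group together even though no two proteins carry an identical end-point annotation. That is the effect the abstract attributes to comparing annotation graphs rather than end-point terms alone.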