Predicting gene ontology from a global meta-analysis of 1-color microarray experiments
ABSTRACT: BACKGROUND: Global meta-analysis (GMA) of microarray data to identify genes with highly similar co-expression profiles is emerging as an accurate method to predict gene function and phenotype, even in the absence of published data on the gene(s) being analyzed. With a third of human genes...
Main authors: | Dozmorov, Mikhail G; Giles, Cory B; Wren, Jonathan D |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3236836/ https://www.ncbi.nlm.nih.gov/pubmed/22166114 http://dx.doi.org/10.1186/1471-2105-12-S10-S14 |
_version_ | 1782218792907046912 |
---|---|
author | Dozmorov, Mikhail G; Giles, Cory B; Wren, Jonathan D |
author_sort | Dozmorov, Mikhail G |
collection | PubMed |
description | ABSTRACT: BACKGROUND: Global meta-analysis (GMA) of microarray data to identify genes with highly similar co-expression profiles is emerging as an accurate method to predict gene function and phenotype, even in the absence of published data on the gene(s) being analyzed. With a third of human genes still uncharacterized, this approach is a promising way to direct experiments and rapidly understand the biological roles of genes. To predict function for genes of interest, GMA relies on a guilt-by-association approach to identify sets of genes with known functions that are consistently co-expressed with it across different experimental conditions, suggesting coordinated regulation for a specific biological purpose. Our goal here is to define how sample, dataset size and ranking parameters affect prediction performance. RESULTS: 13,000 human 1-color microarrays were downloaded from GEO for GMA analysis. Prediction performance was benchmarked by calculating the distance within the Gene Ontology (GO) tree between predicted function and annotated function for sets of 100 randomly selected genes. We find the number of new predicted functions rises as more datasets are added, but begins to saturate at a sample size of approximately 2,000 experiments. For the gene set used to predict function, we find precision to be higher with smaller set sizes, yet with correspondingly poor recall and, as set size is increased, recall and F-measure also tend to increase but at the cost of precision. CONCLUSIONS: Of the 20,813 genes expressed in 50 or more experiments, at least one predicted GO category was found for 72.5% of them. Of the 5,720 genes without GO annotation, 4,189 had at least one predicted ontology using top 40 co-expressed genes for prediction analysis. For the remaining 1,531 genes without GO predictions or annotations, ~17% (257 genes) had sufficient co-expression data yet no statistically significantly overrepresented ontologies, suggesting their regulation may be more complex. |
format | Online Article Text |
id | pubmed-3236836 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3236836 2011-12-14 Predicting gene ontology from a global meta-analysis of 1-color microarray experiments Dozmorov, Mikhail G; Giles, Cory B; Wren, Jonathan D BMC Bioinformatics Proceedings BioMed Central 2011-10-18 /pmc/articles/PMC3236836/ /pubmed/22166114 http://dx.doi.org/10.1186/1471-2105-12-S10-S14 Text en Copyright ©2011 Dozmorov et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Dozmorov, Mikhail G Giles, Cory B Wren, Jonathan D Predicting gene ontology from a global meta-analysis of 1-color microarray experiments |
title | Predicting gene ontology from a global meta-analysis of 1-color microarray experiments |
title_sort | predicting gene ontology from a global meta-analysis of 1-color microarray experiments |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3236836/ https://www.ncbi.nlm.nih.gov/pubmed/22166114 http://dx.doi.org/10.1186/1471-2105-12-S10-S14 |
work_keys_str_mv | AT dozmorovmikhailg predictinggeneontologyfromaglobalmetaanalysisof1colormicroarrayexperiments AT gilescoryb predictinggeneontologyfromaglobalmetaanalysisof1colormicroarrayexperiments AT wrenjonathand predictinggeneontologyfromaglobalmetaanalysisof1colormicroarrayexperiments |
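
The abstract reports a precision/recall trade-off and several gene counts. The Python sketch below is only an illustration of those quantities, not the authors' pipeline: it scores predictions by exact match between sets of GO term IDs, whereas the abstract describes benchmarking by distance within the GO tree. The function name `precision_recall_f`, the example GO term sets, and the idea that a larger co-expressed gene set yields a longer list of predicted terms are all assumptions made for illustration. The final lines simply re-derive the counts quoted in the abstract.

```python
from typing import Set, Tuple


def precision_recall_f(predicted: Set[str], annotated: Set[str]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F-measure between two sets of GO term IDs.

    Simplifying assumption: a prediction counts only if the identical term is
    annotated; no credit is given for nearby terms in the GO tree.
    """
    true_positives = len(predicted & annotated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical annotation and predictions for one gene (placeholder term sets).
annotated = {"GO:0006955", "GO:0006954", "GO:0008283", "GO:0007165"}
small_prediction = {"GO:0006955"}
large_prediction = {
    "GO:0006955", "GO:0006954", "GO:0008283",
    "GO:0016049", "GO:0006412", "GO:0006810",
}

small = precision_recall_f(small_prediction, annotated)
large = precision_recall_f(large_prediction, annotated)
print("small set: precision=%.2f recall=%.2f F=%.2f" % small)
print("large set: precision=%.2f recall=%.2f F=%.2f" % large)

# Consistency check on the counts quoted in the abstract.
unannotated_genes = 5720
with_prediction = 4189
remaining = unannotated_genes - with_prediction   # genes with neither annotation nor prediction
no_enrichment = 257
fraction_no_enrichment = no_enrichment / remaining
print(remaining, round(100 * fraction_no_enrichment, 1))
```

In this toy case the short prediction list is fully correct but recovers one of four annotated terms, while the longer list recovers three of four at half the precision and a higher F-measure, which mirrors the direction of the trade-off described in the abstract. The count check reproduces the 1,531 remaining genes and shows that 257 of them is about 16.8%, consistent with the quoted "~17%".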