Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease
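Main authors: | Jelier, Rob; 't Hoen, Peter AC; Sterrenburg, Ellen; den Dunnen, Johan T; van Ommen, Gert-Jan B; Kors, Jan A; Mons, Barend |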
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2459190/ https://www.ncbi.nlm.nih.gov/pubmed/18577208 http://dx.doi.org/10.1186/1471-2105-9-291 |
_version_ | 1782157411848552448 |
---|---|
author | Jelier, Rob; 't Hoen, Peter AC; Sterrenburg, Ellen; den Dunnen, Johan T; van Ommen, Gert-Jan B; Kors, Jan A; Mons, Barend |
author_facet | Jelier, Rob; 't Hoen, Peter AC; Sterrenburg, Ellen; den Dunnen, Johan T; van Ommen, Gert-Jan B; Kors, Jan A; Mons, Barend |
author_sort | Jelier, Rob |
collection | PubMed |
description | BACKGROUND: Comparative analysis of expression microarray studies is difficult due to the large influence of technical factors on experimental outcome. Still, the identified differentially expressed genes may hint at the same biological processes. However, manually curated assignment of genes to biological processes, such as pursued by the Gene Ontology (GO) consortium, is incomplete and limited. We hypothesised that automatic association of genes with biological processes through thesaurus-controlled mining of Medline abstracts would be more effective. Therefore, we developed a novel algorithm (LAMA: Literature-Aided Meta-Analysis) to quantify the similarity between transcriptomics studies. We evaluated our algorithm on a large compendium of 102 microarray studies published in the field of muscle development and disease, and compared it to similarity measures based on gene overlap and over-representation of biological processes assigned by GO. RESULTS: While the overlap in both genes and overrepresented GO-terms was poor, LAMA retrieved many more biologically meaningful links between studies, with substantially lower influence of technical factors. LAMA correctly grouped muscular dystrophy, regeneration and myositis studies, and linked patient and corresponding mouse model studies. LAMA also retrieves the connecting biological concepts. Among other new discoveries, we associated cullin proteins, a class of ubiquitinylation proteins, with genes down-regulated during muscle regeneration, whereas ubiquitinylation was previously reported to be activated during the inverse process: muscle atrophy. CONCLUSION: Our literature-based association analysis is capable of finding hidden common biological denominators in microarray studies, and circumvents the need for raw data analysis or curated gene annotation databases. |
format | Text |
id | pubmed-2459190 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2459190 2008-07-14 Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease Jelier, Rob; 't Hoen, Peter AC; Sterrenburg, Ellen; den Dunnen, Johan T; van Ommen, Gert-Jan B; Kors, Jan A; Mons, Barend BMC Bioinformatics Methodology Article BioMed Central 2008-06-24 /pmc/articles/PMC2459190/ /pubmed/18577208 http://dx.doi.org/10.1186/1471-2105-9-291 Text en Copyright © 2008 Jelier et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Jelier, Rob 't Hoen, Peter AC Sterrenburg, Ellen den Dunnen, Johan T van Ommen, Gert-Jan B Kors, Jan A Mons, Barend Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease |
title | Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease |
title_full | Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease |
title_fullStr | Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease |
title_full_unstemmed | Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease |
title_short | Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease |
title_sort | literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2459190/ https://www.ncbi.nlm.nih.gov/pubmed/18577208 http://dx.doi.org/10.1186/1471-2105-9-291 |
work_keys_str_mv | AT jelierrob literatureaidedmetaanalysisofmicroarraydataacompendiumstudyonmuscledevelopmentanddisease AT thoenpeterac literatureaidedmetaanalysisofmicroarraydataacompendiumstudyonmuscledevelopmentanddisease AT sterrenburgellen literatureaidedmetaanalysisofmicroarraydataacompendiumstudyonmuscledevelopmentanddisease AT dendunnenjohant literatureaidedmetaanalysisofmicroarraydataacompendiumstudyonmuscledevelopmentanddisease AT vanommengertjanb literatureaidedmetaanalysisofmicroarraydataacompendiumstudyonmuscledevelopmentanddisease AT korsjana literatureaidedmetaanalysisofmicroarraydataacompendiumstudyonmuscledevelopmentanddisease AT monsbarend literatureaidedmetaanalysisofmicroarraydataacompendiumstudyonmuscledevelopmentanddisease |