
Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease

BACKGROUND: Comparative analysis of expression microarray studies is difficult due to the large influence of technical factors on experimental outcome. Still, the identified differentially expressed genes may hint at the same biological processes. However, manually curated assignment of genes to bio...

Bibliographic Details
Main Authors: Jelier, Rob, 't Hoen, Peter AC, Sterrenburg, Ellen, den Dunnen, Johan T, van Ommen, Gert-Jan B, Kors, Jan A, Mons, Barend
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2459190/
https://www.ncbi.nlm.nih.gov/pubmed/18577208
http://dx.doi.org/10.1186/1471-2105-9-291
_version_ 1782157411848552448
author Jelier, Rob
't Hoen, Peter AC
Sterrenburg, Ellen
den Dunnen, Johan T
van Ommen, Gert-Jan B
Kors, Jan A
Mons, Barend
author_sort Jelier, Rob
collection PubMed
description BACKGROUND: Comparative analysis of expression microarray studies is difficult due to the large influence of technical factors on experimental outcome. Still, the identified differentially expressed genes may hint at the same biological processes. However, manually curated assignment of genes to biological processes, such as pursued by the Gene Ontology (GO) consortium, is incomplete and limited. We hypothesised that automatic association of genes with biological processes through thesaurus-controlled mining of Medline abstracts would be more effective. Therefore, we developed a novel algorithm (LAMA: Literature-Aided Meta-Analysis) to quantify the similarity between transcriptomics studies. We evaluated our algorithm on a large compendium of 102 microarray studies published in the field of muscle development and disease, and compared it to similarity measures based on gene overlap and over-representation of biological processes assigned by GO. RESULTS: While the overlap in both genes and overrepresented GO-terms was poor, LAMA retrieved many more biologically meaningful links between studies, with substantially lower influence of technical factors. LAMA correctly grouped muscular dystrophy, regeneration and myositis studies, and linked patient and corresponding mouse model studies. LAMA also retrieves the connecting biological concepts. Among other new discoveries, we associated cullin proteins, a class of ubiquitinylation proteins, with genes down-regulated during muscle regeneration, whereas ubiquitinylation was previously reported to be activated during the inverse process: muscle atrophy. CONCLUSION: Our literature-based association analysis is capable of finding hidden common biological denominators in microarray studies, and circumvents the need for raw data analysis or curated gene annotation databases.
format Text
id pubmed-2459190
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2459190 2008-07-14 BMC Bioinformatics Methodology Article BioMed Central 2008-06-24 /pmc/articles/PMC2459190/ /pubmed/18577208 http://dx.doi.org/10.1186/1471-2105-9-291 Text en Copyright © 2008 Jelier et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease
topic Methodology Article