Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data

Bibliographic Details
Main authors: Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; DeJongh, Matthew
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3462729/
https://www.ncbi.nlm.nih.gov/pubmed/22873695
http://dx.doi.org/10.1186/1471-2105-13-193
_version_ 1782245201273683968
author Tintle, Nathan L
Sitarik, Alexandra
Boerema, Benjamin
Young, Kylie
Best, Aaron A
DeJongh, Matthew
author_facet Tintle, Nathan L
Sitarik, Alexandra
Boerema, Benjamin
Young, Kylie
Best, Aaron A
DeJongh, Matthew
author_sort Tintle, Nathan L
collection PubMed
description BACKGROUND: Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. RESULTS: We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. CONCLUSIONS: Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.
format Online
Article
Text
id pubmed-3462729
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-34627292012-10-03 Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data Tintle, Nathan L Sitarik, Alexandra Boerema, Benjamin Young, Kylie Best, Aaron A DeJongh, Matthew BMC Bioinformatics Research Article BACKGROUND: Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. RESULTS: We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. CONCLUSIONS: Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data. BioMed Central 2012-08-08 /pmc/articles/PMC3462729/ /pubmed/22873695 http://dx.doi.org/10.1186/1471-2105-13-193 Text en Copyright ©2012 Tintle et al.; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Tintle, Nathan L
Sitarik, Alexandra
Boerema, Benjamin
Young, Kylie
Best, Aaron A
DeJongh, Matthew
Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data
title Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data
title_full Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data
title_fullStr Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data
title_full_unstemmed Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data
title_short Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data
title_sort evaluating the consistency of gene sets used in the analysis of bacterial gene expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3462729/
https://www.ncbi.nlm.nih.gov/pubmed/22873695
http://dx.doi.org/10.1186/1471-2105-13-193