Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

BACKGROUND: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set anal...

Bibliographic Details
Main Authors: Hettne, Kristina M; Boorsma, André; van Dartel, Dorien A M; Goeman, Jelle J; de Jong, Esther; Piersma, Aldert H; Stierum, Rob H; Kleinjans, Jos C; Kors, Jan A
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3572439/
https://www.ncbi.nlm.nih.gov/pubmed/23356878
http://dx.doi.org/10.1186/1755-8794-6-2
_version_ 1782259325916413952
author Hettne, Kristina M
Boorsma, André
van Dartel, Dorien A M
Goeman, Jelle J
de Jong, Esther
Piersma, Aldert H
Stierum, Rob H
Kleinjans, Jos C
Kors, Jan A
author_sort Hettne, Kristina M
collection PubMed
description BACKGROUND: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. METHODS: We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate-corrected p-values < 0.05) of the next-gen TM-derived gene sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared the results with those from the Connectivity Map (cMap). We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. RESULTS: Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored as significant against the PPARA GE signature. Thirty-three environmental toxicant gene sets were significantly altered in the triazole GE data sets. Twenty-one of these toxicants had a toxicity pattern similar to that of the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. CONCLUSIONS: Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.
format Online
Article
Text
id pubmed-3572439
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3572439 2013-02-14 BMC Med Genomics Research Article BioMed Central 2013-01-29 /pmc/articles/PMC3572439/ /pubmed/23356878 http://dx.doi.org/10.1186/1755-8794-6-2 Text en Copyright ©2013 Hettne et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3572439/
https://www.ncbi.nlm.nih.gov/pubmed/23356878
http://dx.doi.org/10.1186/1755-8794-6-2