Prioritizing biological pathways by recognizing context in time-series gene expression data

BACKGROUND: The primary goal of pathway analysis using transcriptome data is to find significantly perturbed pathways. However, pathway analysis is not always successful in identifying pathways that are truly relevant to the context under study. A major reason for this difficulty is that a single ge...

Bibliographic Details
Main authors: Lee, Jusang, Jo, Kyuri, Lee, Sunwon, Kang, Jaewoo, Kim, Sun
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5259824/
https://www.ncbi.nlm.nih.gov/pubmed/28155707
http://dx.doi.org/10.1186/s12859-016-1335-8
_version_ 1782499281103486976
author Lee, Jusang
Jo, Kyuri
Lee, Sunwon
Kang, Jaewoo
Kim, Sun
author_facet Lee, Jusang
Jo, Kyuri
Lee, Sunwon
Kang, Jaewoo
Kim, Sun
author_sort Lee, Jusang
collection PubMed
description BACKGROUND: The primary goal of pathway analysis using transcriptome data is to find significantly perturbed pathways. However, pathway analysis is not always successful in identifying pathways that are truly relevant to the context under study. A major reason for this difficulty is that a single gene is involved in multiple pathways. In the KEGG pathway database, there are 146 genes, each of which is involved in more than 20 pathways. Thus activation of even a single gene will result in activation of many pathways. This complex relationship often makes the pathway analysis very difficult. While we need much more powerful pathway analysis methods, a readily available alternative way is to incorporate the literature information. RESULTS: In this study, we propose a novel approach for prioritizing pathways by combining results from both pathway analysis tools and literature information. The basic idea is as follows. Whenever there are enough articles that provide evidence on which pathways are relevant to the context, we can be assured that the pathways are indeed related to the context, which is termed as relevance in this paper. However, if there are few or no articles reported, then we should rely on the results from the pathway analysis tools, which is termed as significance in this paper. We realized this concept as an algorithm by introducing Context Score and Impact Score and then combining the two into a single score. Our method ranked truly relevant pathways significantly higher than existing pathway analysis tools in experiments with two data sets. CONCLUSIONS: Our novel framework was implemented as ContextTRAP by utilizing two existing tools, TRAP and BEST. ContextTRAP will be a useful tool for the pathway based analysis of gene expression data since the user can specify the context of the biological experiment in a set of keywords. The web version of ContextTRAP is available at http://biohealth.snu.ac.kr/software/contextTRAP. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1335-8) contains supplementary material, which is available to authorized users.
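
The abstract describes combining a literature-based Context Score (relevance) with a tool-based Impact Score (significance), leaning on the literature when enough articles exist and on the analysis tool otherwise. The record does not give the actual formula, so the following is only a minimal sketch of that idea under stated assumptions: both scores are taken to lie in [0, 1], the weighting is a simple linear ramp, and the names `PathwayEvidence`, `literature_weight`, `combined_score`, `rank_pathways`, and the threshold `enough` are hypothetical and not taken from ContextTRAP, TRAP, or BEST.

```python
from dataclasses import dataclass


@dataclass
class PathwayEvidence:
    """Hypothetical container for one pathway's evidence."""
    name: str
    impact_score: float   # significance from a pathway analysis tool, assumed in [0, 1]
    context_score: float  # relevance from literature search, assumed in [0, 1]
    n_articles: int       # number of supporting articles found for the context


def literature_weight(n_articles: int, enough: int = 10) -> float:
    """Assumed linear ramp: 0 with no articles, 1 once `enough` articles exist."""
    if enough <= 0:
        raise ValueError("enough must be positive")
    return min(max(n_articles, 0), enough) / enough


def combined_score(p: PathwayEvidence, enough: int = 10) -> float:
    """Blend the two scores; more articles shift weight toward the context score."""
    w = literature_weight(p.n_articles, enough)
    return w * p.context_score + (1.0 - w) * p.impact_score


def rank_pathways(pathways: list[PathwayEvidence], enough: int = 10) -> list[PathwayEvidence]:
    """Return pathways sorted from highest to lowest combined score."""
    return sorted(pathways, key=lambda p: combined_score(p, enough), reverse=True)


# Illustrative, made-up inputs (not results from the article).
example = [
    PathwayEvidence("pathway_A", impact_score=0.9, context_score=0.2, n_articles=0),
    PathwayEvidence("pathway_B", impact_score=0.3, context_score=0.8, n_articles=20),
    PathwayEvidence("pathway_C", impact_score=0.4, context_score=0.6, n_articles=5),
]
ranking = [p.name for p in rank_pathways(example)]
```

With no articles, `pathway_A` is scored purely by its impact score; with many articles, `pathway_B` is scored purely by its context score; `pathway_C` sits in between. A linear ramp is just one possible weighting, chosen here for readability.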
format Online
Article
Text
id pubmed-5259824
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-52598242017-01-26 Prioritizing biological pathways by recognizing context in time-series gene expression data Lee, Jusang Jo, Kyuri Lee, Sunwon Kang, Jaewoo Kim, Sun BMC Bioinformatics Research BACKGROUND: The primary goal of pathway analysis using transcriptome data is to find significantly perturbed pathways. However, pathway analysis is not always successful in identifying pathways that are truly relevant to the context under study. A major reason for this difficulty is that a single gene is involved in multiple pathways. In the KEGG pathway database, there are 146 genes, each of which is involved in more than 20 pathways. Thus activation of even a single gene will result in activation of many pathways. This complex relationship often makes the pathway analysis very difficult. While we need much more powerful pathway analysis methods, a readily available alternative way is to incorporate the literature information. RESULTS: In this study, we propose a novel approach for prioritizing pathways by combining results from both pathway analysis tools and literature information. The basic idea is as follows. Whenever there are enough articles that provide evidence on which pathways are relevant to the context, we can be assured that the pathways are indeed related to the context, which is termed as relevance in this paper. However, if there are few or no articles reported, then we should rely on the results from the pathway analysis tools, which is termed as significance in this paper. We realized this concept as an algorithm by introducing Context Score and Impact Score and then combining the two into a single score. Our method ranked truly relevant pathways significantly higher than existing pathway analysis tools in experiments with two data sets. CONCLUSIONS: Our novel framework was implemented as ContextTRAP by utilizing two existing tools, TRAP and BEST. 
ContextTRAP will be a useful tool for the pathway based analysis of gene expression data since the user can specify the context of the biological experiment in a set of keywords. The web version of ContextTRAP is available at http://biohealth.snu.ac.kr/software/contextTRAP. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1335-8) contains supplementary material, which is available to authorized users. BioMed Central 2016-12-23 /pmc/articles/PMC5259824/ /pubmed/28155707 http://dx.doi.org/10.1186/s12859-016-1335-8 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Lee, Jusang
Jo, Kyuri
Lee, Sunwon
Kang, Jaewoo
Kim, Sun
Prioritizing biological pathways by recognizing context in time-series gene expression data
title Prioritizing biological pathways by recognizing context in time-series gene expression data
title_full Prioritizing biological pathways by recognizing context in time-series gene expression data
title_fullStr Prioritizing biological pathways by recognizing context in time-series gene expression data
title_full_unstemmed Prioritizing biological pathways by recognizing context in time-series gene expression data
title_short Prioritizing biological pathways by recognizing context in time-series gene expression data
title_sort prioritizing biological pathways by recognizing context in time-series gene expression data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5259824/
https://www.ncbi.nlm.nih.gov/pubmed/28155707
http://dx.doi.org/10.1186/s12859-016-1335-8
work_keys_str_mv AT leejusang prioritizingbiologicalpathwaysbyrecognizingcontextintimeseriesgeneexpressiondata
AT jokyuri prioritizingbiologicalpathwaysbyrecognizingcontextintimeseriesgeneexpressiondata
AT leesunwon prioritizingbiologicalpathwaysbyrecognizingcontextintimeseriesgeneexpressiondata
AT kangjaewoo prioritizingbiologicalpathwaysbyrecognizingcontextintimeseriesgeneexpressiondata
AT kimsun prioritizingbiologicalpathwaysbyrecognizingcontextintimeseriesgeneexpressiondata