
Application of dynamic topic models to toxicogenomics data

BACKGROUND: All biological processes are inherently dynamic. Biological systems evolve transiently or sustainably according to sequential time points after perturbation by environmental insults, drugs and chemicals. Investigating the temporal behavior of molecular events has been an important subject...


Bibliographic Details
Main authors: Lee, Mikyung, Liu, Zhichao, Huang, Ruili, Tong, Weida
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5073961/
https://www.ncbi.nlm.nih.gov/pubmed/27766956
http://dx.doi.org/10.1186/s12859-016-1225-0
_version_ 1782461666844213248
author Lee, Mikyung
Liu, Zhichao
Huang, Ruili
Tong, Weida
author_facet Lee, Mikyung
Liu, Zhichao
Huang, Ruili
Tong, Weida
author_sort Lee, Mikyung
collection PubMed
description BACKGROUND: All biological processes are inherently dynamic. Biological systems evolve transiently or sustainably according to sequential time points after perturbation by environmental insults, drugs and chemicals. Investigating the temporal behavior of molecular events is important for understanding the underlying mechanisms governing the biological system in response to, for example, drug treatment. The intrinsic complexity of time-series data requires appropriate computational algorithms for data interpretation. In this study, we propose, for the first time, the application of dynamic topic models (DTM) for analyzing time-series gene expression data. RESULTS: A large time-series toxicogenomics dataset was studied. It contains over 3144 microarrays of gene expression data corresponding to rat livers treated with 131 compounds (most are drugs) at two doses (control and high dose) in a repeated schedule containing four separate time points (4-, 8-, 15- and 29-day). We analyzed, with DTM, the topics (each consisting of a set of genes) and their biological interpretations over these four time points. We identified hidden patterns embedded in these time-series gene expression profiles. From the topic distribution for each compound-time condition, a number of drugs were successfully clustered by their shared mode-of-action, such as PPARα agonists and COX inhibitors. The biological meaning underlying each topic was interpreted using diverse sources of information such as functional analysis of the pathways and therapeutic uses of the drugs. Additionally, we found that sample clusters produced by DTM are much more coherent in terms of functional categories when compared to traditional clustering algorithms. CONCLUSIONS: We demonstrated that DTM, a text mining technique, can be a powerful computational approach for clustering time-series gene expression profiles with the probabilistic representation of their dynamic features along sequential time frames. The method offers an alternative way to uncover hidden patterns embedded in time-series gene expression profiles and to gain an enhanced understanding of the dynamic behavior of gene regulation in the biological system. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1225-0) contains supplementary material, which is available to authorized users.
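The abstract treats each compound-time condition as a document whose words are genes, analyzed across ordered time points. The sketch below illustrates only the data-preparation step that such a model typically needs: documents sorted by time point, plus the number of documents in each time slice. It is a minimal illustration, not the authors' pipeline. The compound names, gene symbols, fold-change values, and the thresholding and scaling rule in `to_document` are all assumptions made for this example; the record does not state how expression values were converted to counts.

```python
from collections import defaultdict

# Hypothetical toy input: (compound, day) -> {gene: log2 fold change vs. control}.
# Names and values are illustrative only; they are not taken from the study.
profiles = {
    ("compound_A", 4): {"Acox1": 2.1, "Cyp4a1": 3.0, "Ptgs2": 0.1},
    ("compound_A", 8): {"Acox1": 2.6, "Cyp4a1": 3.4, "Ptgs2": 0.0},
    ("compound_B", 4): {"Acox1": 0.2, "Cyp4a1": 0.1, "Ptgs2": -1.8},
    ("compound_B", 8): {"Acox1": 0.1, "Cyp4a1": 0.3, "Ptgs2": -2.2},
}


def to_document(profile, scale=2.0, threshold=1.0):
    """Turn one expression profile into a bag of gene 'words'.

    Assumed encoding (not from the source): a gene whose |log2 fold change|
    meets the threshold appears round(|value| * scale) times; all other
    genes are dropped from the document.
    """
    doc = {}
    for gene, value in profile.items():
        if abs(value) >= threshold:
            doc[gene] = round(abs(value) * scale)
    return doc


def build_time_sliced_corpus(profiles):
    """Order documents by time point and count the documents per time slice."""
    by_day = defaultdict(list)
    for (compound, day), profile in profiles.items():
        by_day[day].append((compound, to_document(profile)))

    labels, corpus, time_slice = [], [], []
    for day in sorted(by_day):
        # Sort compounds within a day so the output order is deterministic.
        docs = sorted(by_day[day], key=lambda item: item[0])
        time_slice.append(len(docs))
        for compound, doc in docs:
            labels.append((compound, day))
            corpus.append(doc)
    return labels, corpus, time_slice


labels, corpus, time_slice = build_time_sliced_corpus(profiles)
```

Here `corpus` holds one gene-count dictionary per compound-time condition in time order, and `time_slice` gives the number of documents at each time point. A dynamic topic model implementation would take inputs of roughly this shape and return a topic distribution per document, which is what the abstract uses to cluster drugs.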
format Online
Article
Text
id pubmed-5073961
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5073961 2016-10-27 Application of dynamic topic models to toxicogenomics data Lee, Mikyung Liu, Zhichao Huang, Ruili Tong, Weida BMC Bioinformatics Proceedings BioMed Central 2016-10-06 /pmc/articles/PMC5073961/ /pubmed/27766956 http://dx.doi.org/10.1186/s12859-016-1225-0 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Proceedings
Lee, Mikyung
Liu, Zhichao
Huang, Ruili
Tong, Weida
Application of dynamic topic models to toxicogenomics data
title Application of dynamic topic models to toxicogenomics data
title_full Application of dynamic topic models to toxicogenomics data
title_fullStr Application of dynamic topic models to toxicogenomics data
title_full_unstemmed Application of dynamic topic models to toxicogenomics data
title_short Application of dynamic topic models to toxicogenomics data
title_sort application of dynamic topic models to toxicogenomics data
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5073961/
https://www.ncbi.nlm.nih.gov/pubmed/27766956
http://dx.doi.org/10.1186/s12859-016-1225-0
work_keys_str_mv AT leemikyung applicationofdynamictopicmodelstotoxicogenomicsdata
AT liuzhichao applicationofdynamictopicmodelstotoxicogenomicsdata
AT huangruili applicationofdynamictopicmodelstotoxicogenomicsdata
AT tongweida applicationofdynamictopicmodelstotoxicogenomicsdata