Asymmetric author-topic model for knowledge discovering of big data in toxicogenomics

The advancement of high-throughput screening technologies facilitates the generation of massive amounts of biological data, a big data phenomenon in biomedical science. Yet researchers still rely heavily on keyword search and/or literature review to navigate the databases, and analyses are often done...

Bibliographic Details
Main Authors: Chung, Ming-Hua, Wang, Yuping, Tang, Hailin, Zou, Wen, Basinger, John, Xu, Xiaowei, Tong, Weida
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects: Pharmacology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4403303/
https://www.ncbi.nlm.nih.gov/pubmed/25941488
http://dx.doi.org/10.3389/fphar.2015.00081
author Chung, Ming-Hua
Wang, Yuping
Tang, Hailin
Zou, Wen
Basinger, John
Xu, Xiaowei
Tong, Weida
collection PubMed
description The advancement of high-throughput screening technologies facilitates the generation of massive amounts of biological data, a big data phenomenon in biomedical science. Yet researchers still rely heavily on keyword search and/or literature review to navigate the databases, and analyses are often done on a rather small scale. As a result, the rich information in a database has not been fully utilized, particularly the information embedded in the interactions between data points, which is largely ignored and buried. For the past 10 years, probabilistic topic modeling has been recognized as an effective machine learning algorithm for annotating the hidden thematic structure of massive collections of documents. The analogy between a text corpus and large-scale genomic data enables the application of text mining tools, like probabilistic topic models, to explore hidden patterns in genomic data and, by extension, altered biological functions. In this paper, we developed a generalized probabilistic topic model to analyze a toxicogenomics dataset that consists of a large body of gene expression data from the livers of rats treated with drugs at multiple doses and time points. We discovered hidden patterns in gene expression associated with the effects of dose and time point of treatment. Finally, we illustrated the ability of our model to identify evidence supporting a potential reduction in animal use.
format Online
Article
Text
id pubmed-4403303
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4403303 2015-05-04 Front Pharmacol Pharmacology Frontiers Media S.A. 2015-04-20 /pmc/articles/PMC4403303/ /pubmed/25941488 http://dx.doi.org/10.3389/fphar.2015.00081 Text en Copyright © 2015 Chung, Wang, Tang, Zou, Basinger, Xu and Tong.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Asymmetric author-topic model for knowledge discovering of big data in toxicogenomics
topic Pharmacology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4403303/
https://www.ncbi.nlm.nih.gov/pubmed/25941488
http://dx.doi.org/10.3389/fphar.2015.00081