
Robust Co-clustering to Discover Toxicogenomic Biomarkers and Their Regulatory Doses of Chemical Compounds Using Logistic Probabilistic Hidden Variable Model

Detection of biomarker genes and their regulatory doses of chemical compounds (DCCs) is one of the most important tasks in toxicogenomic studies as well as in drug design and development. There is an online computational platform “Toxygates” to identify biomarker genes and their regulatory DCCs by c...


Bibliographic Details
Main authors: Hasan, Mohammad Nazmol; Rana, Md. Masud; Begum, Anjuman Ara; Rahman, Moizur; Mollah, Md. Nurul Haque
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6225736/
https://www.ncbi.nlm.nih.gov/pubmed/30450112
http://dx.doi.org/10.3389/fgene.2018.00516
author Hasan, Mohammad Nazmol
Rana, Md. Masud
Begum, Anjuman Ara
Rahman, Moizur
Mollah, Md. Nurul Haque
collection PubMed
description Detecting biomarker genes and their regulatory doses of chemical compounds (DCCs) is one of the most important tasks in toxicogenomic studies as well as in drug design and development. The online computational platform “Toxygates” identifies biomarker genes and their regulatory DCCs by a co-clustering approach. However, that platform's algorithm, which is based on hierarchical clustering (HC), does not share gene-DCC two-way information simultaneously during co-clustering between genes and DCCs. It is also sensitive to outlying observations. Thus, the platform may produce misleading results in some cases. The probabilistic hidden variable model (PHVM) is a more effective co-clustering approach that shares two-way information simultaneously, but it is also sensitive to outlying observations. Therefore, in this paper we propose the logistic probabilistic hidden variable model (LPHVM) for robust co-clustering between genes and DCCs, since gene expression data are often contaminated by outlying observations. We investigated the performance of the proposed LPHVM co-clustering approach in comparison with the conventional PHVM and Toxygates co-clustering approaches using simulated and real-life TGP gene expression datasets, respectively. Simulation results show that the proposed method outperforms the conventional PHVM in the presence of outliers; otherwise, the two perform equally. In the real-life TGP data analysis, three DCCs (glibenclamide-low, perhexilline-low, and hexachlorobenzene-medium) for the glutathione metabolism pathway dataset and two DCCs (acetaminophen-medium and methapyrilene-low) for the PPAR signaling pathway dataset were incorrectly co-clustered by the Toxygates online platform, while only one DCC (hexachlorobenzene-low) for the glutathione metabolism pathway was incorrectly co-clustered by the proposed LPHVM approach. Our findings from the real data analysis are also supported by other findings in the literature.
format Online
Article
Text
id pubmed-6225736
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6225736 2018-11-16 Front Genet Genetics Frontiers Media S.A. 2018-11-01 /pmc/articles/PMC6225736/ /pubmed/30450112 http://dx.doi.org/10.3389/fgene.2018.00516 Text en Copyright © 2018 Hasan, Rana, Begum, Rahman and Mollah. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Robust Co-clustering to Discover Toxicogenomic Biomarkers and Their Regulatory Doses of Chemical Compounds Using Logistic Probabilistic Hidden Variable Model
topic Genetics