A Protein Interaction Information-based Generative Model for Enhancing Gene Clustering
In the field of computational bioinformatics, identifying a set of genes responsible for a particular cellular mechanism is essential for tasks such as medical diagnosis or disease gene identification. Accurately grouping (clustering) the genes is one of the important tasks in u...
Main authors: | Dutta, Pratik; Saha, Sriparna; Pai, Sanket; Kumar, Aviral |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6971242/ https://www.ncbi.nlm.nih.gov/pubmed/31959782 http://dx.doi.org/10.1038/s41598-020-57437-5 |
_version_ | 1783489683042336768 |
---|---|
author | Dutta, Pratik Saha, Sriparna Pai, Sanket Kumar, Aviral |
author_facet | Dutta, Pratik Saha, Sriparna Pai, Sanket Kumar, Aviral |
author_sort | Dutta, Pratik |
collection | PubMed |
description | In the field of computational bioinformatics, identifying a set of genes responsible for a particular cellular mechanism is essential for tasks such as medical diagnosis or disease gene identification. Accurately grouping (clustering) the genes is one of the important tasks in understanding the functionalities of the disease genes. In this regard, ensemble clustering is a promising approach that combines different clustering solutions to generate a nearly accurate gene partitioning. Recently, researchers have used generative models as a smart ensemble method to produce the right consensus solution. In the current paper, we develop a protein-protein interaction-based generative model that can efficiently perform gene clustering. Utilizing protein interaction information as the generative model’s latent variable enhances the model’s efficiency in inferring final probabilistic labels. The proposed generative model utilizes different weak supervision sources rather than any ground truth information. For weak supervision sources, we use a multi-objective optimization-based clustering technique together with the world’s largest gene ontology-based knowledge base, the Gene Ontology Consortium (GOC). These weakly supervised labels are supplied to a generative model that eventually assigns probabilistic labels to all genes. The comparative study with respect to silhouette score, Biological Homogeneity Index (BHI) and Biological Stability Index (BSI) shows that the proposed generative model outperforms other state-of-the-art techniques. |
format | Online Article Text |
id | pubmed-6971242 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-6971242 2020-01-27 A Protein Interaction Information-based Generative Model for Enhancing Gene Clustering Dutta, Pratik Saha, Sriparna Pai, Sanket Kumar, Aviral Sci Rep Article In the field of computational bioinformatics, identifying a set of genes responsible for a particular cellular mechanism is essential for tasks such as medical diagnosis or disease gene identification. Accurately grouping (clustering) the genes is one of the important tasks in understanding the functionalities of the disease genes. In this regard, ensemble clustering is a promising approach that combines different clustering solutions to generate a nearly accurate gene partitioning. Recently, researchers have used generative models as a smart ensemble method to produce the right consensus solution. In the current paper, we develop a protein-protein interaction-based generative model that can efficiently perform gene clustering. Utilizing protein interaction information as the generative model’s latent variable enhances the model’s efficiency in inferring final probabilistic labels. The proposed generative model utilizes different weak supervision sources rather than any ground truth information. For weak supervision sources, we use a multi-objective optimization-based clustering technique together with the world’s largest gene ontology-based knowledge base, the Gene Ontology Consortium (GOC). These weakly supervised labels are supplied to a generative model that eventually assigns probabilistic labels to all genes. The comparative study with respect to silhouette score, Biological Homogeneity Index (BHI) and Biological Stability Index (BSI) shows that the proposed generative model outperforms other state-of-the-art techniques.
Nature Publishing Group UK 2020-01-20 /pmc/articles/PMC6971242/ /pubmed/31959782 http://dx.doi.org/10.1038/s41598-020-57437-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Dutta, Pratik Saha, Sriparna Pai, Sanket Kumar, Aviral A Protein Interaction Information-based Generative Model for Enhancing Gene Clustering |
title | A Protein Interaction Information-based Generative Model for Enhancing Gene Clustering |
title_sort | protein interaction information-based generative model for enhancing gene clustering |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6971242/ https://www.ncbi.nlm.nih.gov/pubmed/31959782 http://dx.doi.org/10.1038/s41598-020-57437-5 |