A partially function-to-topic model for protein function prediction

BACKGROUND: Proteins are macromolecules and the main component of a cell, and thus they are the most essential and versatile material of life. Research on protein functions is of great significance in decoding the secret of life. In recent years, researchers have introduced multi-label su...

Bibliographic Details
Main authors: Liu, Lin; Tang, Lin; Tang, Mingjing; Zhou, Wei
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311909/
https://www.ncbi.nlm.nih.gov/pubmed/30598098
http://dx.doi.org/10.1186/s12864-018-5276-7
author Liu, Lin
Tang, Lin
Tang, Mingjing
Zhou, Wei
collection PubMed
description BACKGROUND: Proteins are macromolecules and the main component of a cell, and thus they are the most essential and versatile material of life. Research on protein functions is of great significance in decoding the secret of life. In recent years, researchers have introduced multi-label supervised topic models such as Labeled Latent Dirichlet Allocation (Labeled-LDA) into protein function prediction, which can yield more accurate and more explanatory predictions. However, Labeled-LDA associates each label (GO term) directly with a corresponding topic, which makes the latent topics completely degenerate and ignores the differences between labels and latent topics. RESULT: To achieve more accurate probabilistic modeling of function labels, we propose a Partially Function-to-Topic Prediction (PFTP) model that introduces a local subset of topics corresponding to each function label. PFTP supports not only a subset of latent topics within a given function label but also a background topic corresponding to a ‘fake’ function label, which represents the common semantics of protein function. Related definitions and the topic modeling process of PFTP are described in this paper. In a 5-fold cross-validation experiment on yeast and human datasets, PFTP significantly outperforms five widely adopted methods for protein function prediction. The impact of model parameters on prediction performance and the latent topics discovered by PFTP are also discussed. CONCLUSION: The experimental results provide evidence that PFTP is effective and has potential value for predicting protein function. Based on its ability to discover a more refined latent sub-structure of function labels, we anticipate that PFTP could reveal a deeper biological explanation for protein functions.
format Online
Article
Text
id pubmed-6311909
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6311909 2019-01-07 A partially function-to-topic model for protein function prediction Liu, Lin Tang, Lin Tang, Mingjing Zhou, Wei BMC Genomics Research BioMed Central 2018-12-31 /pmc/articles/PMC6311909/ /pubmed/30598098 http://dx.doi.org/10.1186/s12864-018-5276-7 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A partially function-to-topic model for protein function prediction
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311909/
https://www.ncbi.nlm.nih.gov/pubmed/30598098
http://dx.doi.org/10.1186/s12864-018-5276-7