Exploring Mouse Protein Function via Multiple Approaches
Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse...
Main authors: | Huang, Guohua; Chu, Chen; Huang, Tao; Kong, Xiangyin; Zhang, Yunhua; Zhang, Ning; Cai, Yu-Dong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5112993/ https://www.ncbi.nlm.nih.gov/pubmed/27846315 http://dx.doi.org/10.1371/journal.pone.0166580 |
_version_ | 1782468119267115008 |
---|---|
author | Huang, Guohua Chu, Chen Huang, Tao Kong, Xiangyin Zhang, Yunhua Zhang, Ning Cai, Yu-Dong |
author_facet | Huang, Guohua Chu, Chen Huang, Tao Kong, Xiangyin Zhang, Yunhua Zhang, Ning Cai, Yu-Dong |
author_sort | Huang, Guohua |
collection | PubMed |
description | Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although the accuracy was lower than that of the previous approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologs are available. However, the accuracy of the predicted functions can only be determined according to known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially be correct upon future experimental verification. Therefore, the accuracy of the presented method may be much higher in reality. |
format | Online Article Text |
id | pubmed-5112993 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5112993 2016-12-08 Exploring Mouse Protein Function via Multiple Approaches Huang, Guohua Chu, Chen Huang, Tao Kong, Xiangyin Zhang, Yunhua Zhang, Ning Cai, Yu-Dong PLoS One Research Article Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although the accuracy was lower than that of the previous approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologs are available. However, the accuracy of the predicted functions can only be determined according to known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially be correct upon future experimental verification. Therefore, the accuracy of the presented method may be much higher in reality. Public Library of Science 2016-11-15 /pmc/articles/PMC5112993/ /pubmed/27846315 http://dx.doi.org/10.1371/journal.pone.0166580 Text en © 2016 Huang et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Huang, Guohua Chu, Chen Huang, Tao Kong, Xiangyin Zhang, Yunhua Zhang, Ning Cai, Yu-Dong Exploring Mouse Protein Function via Multiple Approaches |
title | Exploring Mouse Protein Function via Multiple Approaches |
title_full | Exploring Mouse Protein Function via Multiple Approaches |
title_fullStr | Exploring Mouse Protein Function via Multiple Approaches |
title_full_unstemmed | Exploring Mouse Protein Function via Multiple Approaches |
title_short | Exploring Mouse Protein Function via Multiple Approaches |
title_sort | exploring mouse protein function via multiple approaches |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5112993/ https://www.ncbi.nlm.nih.gov/pubmed/27846315 http://dx.doi.org/10.1371/journal.pone.0166580 |
work_keys_str_mv | AT huangguohua exploringmouseproteinfunctionviamultipleapproaches AT chuchen exploringmouseproteinfunctionviamultipleapproaches AT huangtao exploringmouseproteinfunctionviamultipleapproaches AT kongxiangyin exploringmouseproteinfunctionviamultipleapproaches AT zhangyunhua exploringmouseproteinfunctionviamultipleapproaches AT zhangning exploringmouseproteinfunctionviamultipleapproaches AT caiyudong exploringmouseproteinfunctionviamultipleapproaches |
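
The description field above mentions a sequential combination of three predictors and a 1st-order accuracy figure. The record does not say how the predictors are chained or exactly how the accuracy is computed, so the following is only a minimal sketch under two assumptions: that each later predictor serves as a fallback when an earlier one returns nothing (for example, no homologue was found), and that 1st-order accuracy is the fraction of proteins whose top-ranked function is among their known functions. The function names, the predictor interface, and the toy labels are all hypothetical and are not taken from the article.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set

# Hypothetical interface: a predictor maps a protein ID to a ranked list of
# function labels, or returns None when it cannot make any prediction.
Predictor = Callable[[str], Optional[List[str]]]


def predict_sequentially(protein: str,
                         predictors: Sequence[Predictor]) -> Optional[List[str]]:
    """Return the ranking from the first predictor that produces one.

    Assumed fallback rule; the article may combine the approaches differently.
    """
    for predictor in predictors:
        ranking = predictor(protein)
        if ranking:  # skip None and empty rankings
            return ranking
    return None


def first_order_accuracy(predictions: Dict[str, Optional[List[str]]],
                         known: Dict[str, Set[str]]) -> float:
    """Fraction of proteins whose top-ranked label is a known function."""
    hits = sum(
        1 for protein, ranking in predictions.items()
        if ranking and ranking[0] in known[protein]
    )
    return hits / len(predictions)


# Toy stand-ins for the three approaches (hypothetical data).
similarity_based = {"P1": ["binding", "transport"]}.get
interaction_based = {"P2": ["catalysis"]}.get


def composition_based(protein: str) -> List[str]:
    # Always answers, mirroring the claim that this approach covers
    # almost all proteins, even those with no homologues.
    return ["binding"]


cascade = [similarity_based, interaction_based, composition_based]
proteins = ["P1", "P2", "P3"]
predictions = {p: predict_sequentially(p, cascade) for p in proteins}
known = {"P1": {"binding"}, "P2": {"catalysis"}, "P3": {"transport"}}
accuracy = first_order_accuracy(predictions, known)
```

In this toy run, P1 is answered by the similarity stand-in, P2 by the interaction stand-in, and P3 falls through to the composition stand-in, which illustrates the coverage-versus-accuracy trade-off the description refers to.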