Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information
BACKGROUND: Protein-protein interactions play essential roles in protein function determination and drug design. Numerous methods have been proposed to recognize their interaction sites; however, only a small proportion of protein complexes have been successfully resolved due to the high cost. There...
Main authors: | Chen, Peng; Li, Jinyan |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2010 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2921408/ https://www.ncbi.nlm.nih.gov/pubmed/20667087 http://dx.doi.org/10.1186/1471-2105-11-402 |
_version_ | 1782185390494449664 |
---|---|
author | Chen, Peng Li, Jinyan |
author_facet | Chen, Peng Li, Jinyan |
author_sort | Chen, Peng |
collection | PubMed |
description | BACKGROUND: Protein-protein interactions play essential roles in protein function determination and drug design. Numerous methods have been proposed to recognize their interaction sites; however, only a small proportion of protein complexes have been successfully resolved due to the high cost. Therefore, it is important to improve the performance of predicting protein interaction sites based on primary sequence alone. RESULTS: We propose a new idea to construct an integrative profile for each residue in a protein by combining its hydrophobic and evolutionary information. A support vector machine (SVM) ensemble is then developed, where SVMs train on different pairs of positive (interface sites) and negative (non-interface sites) subsets. The subsets, which have roughly the same sizes, are grouped in the order of accessible surface area change before and after complexation. A self-organizing map (SOM) technique is applied to group similar input vectors to make the identification of interface residues more accurate. An ensemble of ten SVMs achieves an MCC improvement of around 8% and an F1 improvement of around 9% over an ensemble of three SVMs. As expected, SVM ensembles consistently perform better than individual SVMs. In addition, the model based on the integrative profiles outperforms those based on the sequence profile or the hydropathy scale alone. As our method uses a small number of features to encode the input vectors, our model is simpler, faster and more accurate than the existing methods. CONCLUSIONS: The integrative profile combining hydrophobic and evolutionary information contributes most to the protein-protein interaction prediction. The results show that the evolutionary context of a residue with respect to hydrophobicity improves the identification of protein interface residues. In addition, the ensemble of SVM classifiers improves the prediction performance. AVAILABILITY: Datasets and software are available at http://mail.ustc.edu.cn/~bigeagle/BMCBioinfo2010/index.htm. |
format | Text |
id | pubmed-2921408 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2921408 2010-08-14 Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information Chen, Peng Li, Jinyan BMC Bioinformatics Research Article BACKGROUND: Protein-protein interactions play essential roles in protein function determination and drug design. Numerous methods have been proposed to recognize their interaction sites; however, only a small proportion of protein complexes have been successfully resolved due to the high cost. Therefore, it is important to improve the performance of predicting protein interaction sites based on primary sequence alone. RESULTS: We propose a new idea to construct an integrative profile for each residue in a protein by combining its hydrophobic and evolutionary information. A support vector machine (SVM) ensemble is then developed, where SVMs train on different pairs of positive (interface sites) and negative (non-interface sites) subsets. The subsets, which have roughly the same sizes, are grouped in the order of accessible surface area change before and after complexation. A self-organizing map (SOM) technique is applied to group similar input vectors to make the identification of interface residues more accurate. An ensemble of ten SVMs achieves an MCC improvement of around 8% and an F1 improvement of around 9% over an ensemble of three SVMs. As expected, SVM ensembles consistently perform better than individual SVMs. In addition, the model based on the integrative profiles outperforms those based on the sequence profile or the hydropathy scale alone. As our method uses a small number of features to encode the input vectors, our model is simpler, faster and more accurate than the existing methods. CONCLUSIONS: The integrative profile combining hydrophobic and evolutionary information contributes most to the protein-protein interaction prediction. The results show that the evolutionary context of a residue with respect to hydrophobicity improves the identification of protein interface residues. In addition, the ensemble of SVM classifiers improves the prediction performance. AVAILABILITY: Datasets and software are available at http://mail.ustc.edu.cn/~bigeagle/BMCBioinfo2010/index.htm. BioMed Central 2010-07-28 /pmc/articles/PMC2921408/ /pubmed/20667087 http://dx.doi.org/10.1186/1471-2105-11-402 Text en Copyright ©2010 Chen and Li; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Chen, Peng Li, Jinyan Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information |
title | Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information |
title_full | Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information |
title_fullStr | Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information |
title_full_unstemmed | Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information |
title_short | Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information |
title_sort | sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2921408/ https://www.ncbi.nlm.nih.gov/pubmed/20667087 http://dx.doi.org/10.1186/1471-2105-11-402 |
work_keys_str_mv | AT chenpeng sequencebasedidentificationofinterfaceresiduesbyanintegrativeprofilecombininghydrophobicandevolutionaryinformation AT lijinyan sequencebasedidentificationofinterfaceresiduesbyanintegrativeprofilecombininghydrophobicandevolutionaryinformation |
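
The abstract in this record reports results as MCC (Matthews correlation coefficient) and F1 gains. As a reading aid, the sketch below shows how these two scores are conventionally computed from the confusion counts of a binary interface/non-interface classifier. It is not the authors' software: the function name `f1_and_mcc` and the example counts are illustrative assumptions, and the record does not state the exact evaluation protocol used in the article.

```python
import math


def f1_and_mcc(tp, fp, tn, fn):
    """Return (F1, MCC) from binary confusion counts.

    Hypothetical helper for illustration only; "positive" is taken to
    mean an interface residue and "negative" a non-interface residue.
    """
    # Precision and recall, guarding against empty denominators.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    # MCC uses all four cells of the confusion matrix, which makes it
    # informative when the two classes differ in size.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


# Made-up counts, not data from the article.
f1, mcc = f1_and_mcc(tp=40, fp=10, tn=40, fn=10)
print(f"F1 = {f1:.2f}, MCC = {mcc:.2f}")
```

Because MCC accounts for true negatives as well as true positives, it is often reported next to F1 when non-interface residues outnumber interface residues.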