
DoBo: Protein domain boundary prediction by integrating evolutionary signals and machine learning

BACKGROUND: Accurate identification of protein domain boundaries is useful for protein structure determination and prediction. However, predicting protein domain boundaries from a sequence is still very challenging and largely unsolved. RESULTS: We developed a new method to integrate the classificat...


Bibliographic Details
Main Authors: Eickholt, Jesse; Deng, Xin; Cheng, Jianlin
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3036623/
https://www.ncbi.nlm.nih.gov/pubmed/21284866
http://dx.doi.org/10.1186/1471-2105-12-43
_version_ 1782197876710965248
author Eickholt, Jesse
Deng, Xin
Cheng, Jianlin
author_facet Eickholt, Jesse
Deng, Xin
Cheng, Jianlin
author_sort Eickholt, Jesse
collection PubMed
description BACKGROUND: Accurate identification of protein domain boundaries is useful for protein structure determination and prediction. However, predicting protein domain boundaries from a sequence is still very challenging and largely unsolved. RESULTS: We developed a new method to integrate the classification power of machine learning with evolutionary signals embedded in protein families in order to improve protein domain boundary prediction. The method first extracts putative domain boundary signals from a multiple sequence alignment between a query sequence and its homologs. The putative sites are then classified and scored by support vector machines in conjunction with input features such as sequence profiles, secondary structures, solvent accessibilities around the sites and their positions. The method was evaluated on a domain benchmark by 10-fold cross-validation and 60% of true domain boundaries can be recalled at a precision of 60%. The trade-off between the precision and recall can be adjusted according to specific needs by using different decision thresholds on the domain boundary scores assigned by the support vector machines. CONCLUSIONS: The good prediction accuracy and the flexibility of selecting domain boundary sites at different precision and recall values make our method a useful tool for protein structure determination and modelling. The method is available at http://sysbio.rnet.missouri.edu/dobo/.
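The description notes that precision and recall can be traded off by applying different decision thresholds to the boundary scores. The sketch below illustrates that idea only; it is not the authors' implementation. The candidate scores, the labels, the name `precision_recall`, and the convention of reporting precision as 1.0 when nothing is predicted are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical (score, is_true_boundary) pairs for candidate boundary sites.
# These numbers are invented for illustration; they are not results from the paper.
CANDIDATES: List[Tuple[float, bool]] = [
    (0.9, True), (0.8, False), (0.7, True),
    (0.4, True), (0.3, False), (0.1, False),
]


def precision_recall(candidates: List[Tuple[float, bool]],
                     threshold: float) -> Tuple[float, float]:
    """Return (precision, recall) when sites scoring >= threshold are called boundaries."""
    total_true = sum(1 for _, is_true in candidates if is_true)
    predicted = [is_true for score, is_true in candidates if score >= threshold]
    true_pos = sum(predicted)
    # Assumed convention: with no predictions, report precision as 1.0.
    precision = true_pos / len(predicted) if predicted else 1.0
    recall = true_pos / total_true if total_true else 0.0
    return precision, recall


if __name__ == "__main__":
    # A higher threshold keeps fewer sites: precision tends to rise, recall to fall.
    for t in (0.2, 0.5, 0.85):
        p, r = precision_recall(CANDIDATES, t)
        print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

With these made-up candidates, raising the threshold from 0.2 to 0.85 moves the operating point from high recall toward high precision, which is the kind of adjustment the description refers to.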
format Text
id pubmed-3036623
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3036623 2011-02-10 DoBo: Protein domain boundary prediction by integrating evolutionary signals and machine learning Eickholt, Jesse Deng, Xin Cheng, Jianlin BMC Bioinformatics Research Article BioMed Central 2011-02-01 /pmc/articles/PMC3036623/ /pubmed/21284866 http://dx.doi.org/10.1186/1471-2105-12-43 Text en Copyright ©2011 Eickholt et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title DoBo: Protein domain boundary prediction by integrating evolutionary signals and machine learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3036623/
https://www.ncbi.nlm.nih.gov/pubmed/21284866
http://dx.doi.org/10.1186/1471-2105-12-43