Transmembrane protein topology prediction using support vector machines

BACKGROUND: Alpha-helical transmembrane (TM) proteins are involved in a wide range of important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion. Many are also prime drug targets, and it has been est...

Bibliographic Details
Main authors: Nugent, Timothy; Jones, David T
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2700806/
https://www.ncbi.nlm.nih.gov/pubmed/19470175
http://dx.doi.org/10.1186/1471-2105-10-159
_version_ 1782168656341368832
author Nugent, Timothy
Jones, David T
collection PubMed
description BACKGROUND: Alpha-helical transmembrane (TM) proteins are involved in a wide range of important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion. Many are also prime drug targets, and it has been estimated that more than half of all drugs currently on the market target membrane proteins. However, due to the experimental difficulties involved in obtaining high-quality crystals, this class of protein is severely under-represented in structural databases. In the absence of structural data, sequence-based prediction methods allow TM protein topology to be investigated. RESULTS: We present a support vector machine (SVM)-based TM protein topology predictor that integrates both signal peptide and re-entrant helix prediction, benchmarked with full cross-validation on a novel data set of 131 sequences with known crystal structures. The method achieves topology prediction accuracy of 89%, while signal peptides and re-entrant helices are predicted with 93% and 44% accuracy respectively. An additional SVM trained to discriminate between globular and TM proteins detected zero false positives, with a low false negative rate of 0.4%. We present the results of applying these tools to a number of complete genomes. Source code, data sets and a web server are freely available from . CONCLUSION: The high accuracy of TM topology prediction, which includes detection of both signal peptides and re-entrant helices, combined with the ability to effectively discriminate between TM and globular proteins, makes this method ideally suited to whole genome annotation of alpha-helical transmembrane proteins.
format Text
id pubmed-2700806
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2700806 2009-06-24 Transmembrane protein topology prediction using support vector machines Nugent, Timothy Jones, David T BMC Bioinformatics Research Article
BioMed Central 2009-05-26 /pmc/articles/PMC2700806/ /pubmed/19470175 http://dx.doi.org/10.1186/1471-2105-10-159 Text en Copyright © 2009 Nugent and Jones; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Transmembrane protein topology prediction using support vector machines
topic Research Article