Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach

Prediction of transmembrane (TM) proteins from their sequence facilitates the functional study of genomes and the search for potential membrane-associated therapeutic targets. Computational methods for predicting TM sequences have been developed. These methods achieve high prediction accuracy for many TM...

Bibliographic Details
Main authors: Cai, C. Z., Yuan, Q. F., Xiao, H. G., Liu, X. H., Han, L. Y., Chen, Y. Z.
Format: Online Article Text
Language: English
Published: 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7121931/
http://dx.doi.org/10.1007/11816102_56
_version_ 1783515309648379904
author Cai, C. Z.
Yuan, Q. F.
Xiao, H. G.
Liu, X. H.
Han, L. Y.
Chen, Y. Z.
author_facet Cai, C. Z.
Yuan, Q. F.
Xiao, H. G.
Liu, X. H.
Han, L. Y.
Chen, Y. Z.
author_sort Cai, C. Z.
collection PubMed
description Prediction of transmembrane (TM) proteins from their sequence facilitates the functional study of genomes and the search for potential membrane-associated therapeutic targets. Computational methods for predicting TM sequences have been developed. These methods achieve high prediction accuracy for many TM proteins, but some of them are less effective for specific classes of TM proteins. Moreover, their performance has been tested on a relatively small set of TM and non-membrane (NM) proteins. It is therefore useful to evaluate TM protein prediction methods on a more diverse set of proteins and to test their performance on specific classes of TM proteins. This work extensively evaluated the capability of support vector machine (SVM) classification systems to predict TM proteins and proteins of several TM classes. These SVM systems were trained and tested by using 14962 TM and 12168 NM proteins from Pfam protein families. An independent set of 3389 TM and 6063 NM proteins from curated Pfam families was used to further evaluate the performance of these SVM systems. 90.1% of TM proteins and 86.7% of NM proteins were correctly predicted, accuracies comparable to those from other studies. The prediction accuracies for proteins of specific TM classes are 95.6%, 90.0%, 92.7% and 73.9% for G-protein coupled receptors, envelope proteins, outer membrane proteins, and transporters/channels respectively; and 98.1%, 99.5%, 86.4%, and 98.6% for non-G-protein coupled receptors, non-envelope proteins, non-outer membrane proteins, and non-transporters/non-channels respectively. Tested on a significantly larger number and a more diverse range of proteins than in previous studies, SVM systems appear to be capable of predicting TM proteins and proteins of specific TM classes at accuracies comparable to those from previous studies. Our SVM system, SVMProt, can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
format Online
Article
Text
id pubmed-7121931
institution National Center for Biotechnology Information
language English
publishDate 2006
record_format MEDLINE/PubMed
spelling pubmed-7121931 2020-04-06 Computational Intelligence and Bioinformatics Article 2006 Text en © Springer-Verlag Berlin Heidelberg 2006 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7121931/
http://dx.doi.org/10.1007/11816102_56
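
Worked example (not part of the original record): the abstract reports separate accuracies for TM and NM proteins but no single overall figure. The sketch below shows how the two per-class rates combine into an overall and a balanced accuracy. It assumes that the 90.1% and 86.7% figures refer to the independent set of 3389 TM and 6063 NM proteins, which the abstract implies but does not state outright. The function and variable names are illustrative only.

```python
# Counts and per-class rates as given in the abstract.
# Assumption: both rates were measured on the independent evaluation set.
N_TM, N_NM = 3389, 6063          # TM and NM proteins in the independent set
TM_RATE, NM_RATE = 0.901, 0.867  # fraction correctly predicted in each class


def overall_accuracy(n_pos, n_neg, pos_rate, neg_rate):
    """Fraction of all proteins predicted correctly (count-weighted mean)."""
    correct = n_pos * pos_rate + n_neg * neg_rate
    return correct / (n_pos + n_neg)


def balanced_accuracy(pos_rate, neg_rate):
    """Unweighted mean of the two per-class rates."""
    return (pos_rate + neg_rate) / 2


if __name__ == "__main__":
    acc = overall_accuracy(N_TM, N_NM, TM_RATE, NM_RATE)
    bal = balanced_accuracy(TM_RATE, NM_RATE)
    print(f"overall accuracy:  {acc:.3f}")
    print(f"balanced accuracy: {bal:.3f}")
```

Because the NM class is almost twice as large as the TM class in this set, the overall accuracy is pulled toward the NM rate, while the balanced accuracy weights both classes equally.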