Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach
Prediction of transmembrane (TM) proteins from their sequence facilitates functional study of genomes and the search for potential membrane-associated therapeutic targets. Computational methods for predicting TM sequences have been developed. These methods achieve high prediction accuracy for many TM...
Main Authors: | Cai, C. Z.; Yuan, Q. F.; Xiao, H. G.; Liu, X. H.; Han, L. Y.; Chen, Y. Z. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2006 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7121931/ http://dx.doi.org/10.1007/11816102_56 |
_version_ | 1783515309648379904 |
---|---|
author | Cai, C. Z.; Yuan, Q. F.; Xiao, H. G.; Liu, X. H.; Han, L. Y.; Chen, Y. Z. |
author_facet | Cai, C. Z.; Yuan, Q. F.; Xiao, H. G.; Liu, X. H.; Han, L. Y.; Chen, Y. Z. |
author_sort | Cai, C. Z. |
collection | PubMed |
description | Prediction of transmembrane (TM) proteins from their sequence facilitates functional study of genomes and the search for potential membrane-associated therapeutic targets. Computational methods for predicting TM sequences have been developed. These methods achieve high prediction accuracy for many TM proteins, but some of them are less effective for specific classes of TM proteins. Moreover, their performance has been tested on a relatively small set of TM and non-membrane (NM) proteins. It is therefore useful to evaluate TM protein prediction methods on a more diverse set of proteins and to test their performance on specific classes of TM proteins. This work extensively evaluated the capability of support vector machine (SVM) classification systems to predict TM proteins and proteins of several TM classes. These SVM systems were trained and tested by using 14962 TM and 12168 NM proteins from Pfam protein families. An independent set of 3389 TM and 6063 NM proteins from curated Pfam families was used to further evaluate the performance of these SVM systems. 90.1% of TM proteins and 86.7% of NM proteins were correctly predicted, accuracies comparable to those from other studies. The prediction accuracies for proteins of specific TM classes are 95.6%, 90.0%, 92.7% and 73.9% for G-protein coupled receptors, envelope proteins, outer membrane proteins, and transporters/channels respectively; and 98.1%, 99.5%, 86.4%, and 98.6% for non-G-protein coupled receptors, non-envelope proteins, non-outer membrane proteins, and non-transporters/non-channels respectively. Tested on a significantly larger number and more diverse range of proteins than in previous studies, SVM systems appear to be capable of predicting TM proteins and proteins of specific TM classes at accuracies comparable to those from previous studies. Our SVM systems, SVMProt, can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi. |
format | Online Article Text |
id | pubmed-7121931 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7121931 2020-04-06 Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach Cai, C. Z.; Yuan, Q. F.; Xiao, H. G.; Liu, X. H.; Han, L. Y.; Chen, Y. Z. Computational Intelligence and Bioinformatics Article 2006 /pmc/articles/PMC7121931/ http://dx.doi.org/10.1007/11816102_56 Text en © Springer-Verlag Berlin Heidelberg 2006 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Cai, C. Z.; Yuan, Q. F.; Xiao, H. G.; Liu, X. H.; Han, L. Y.; Chen, Y. Z. Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach |
title | Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach |
title_sort | prediction of transmembrane proteins from their primary sequence by support vector machine approach |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7121931/ http://dx.doi.org/10.1007/11816102_56 |
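The description field reports per-class accuracies (90.1% for TM, 86.7% for NM) but no single overall figure. The sketch below shows how such class-wise rates combine into an overall and a balanced accuracy. It assumes the two percentages refer to the independent set of 3389 TM and 6063 NM proteins, which the record does not state explicitly. The function names are hypothetical and are not part of SVMProt.

```python
def correctly_predicted(n_proteins: int, pct_correct: float) -> float:
    """Number of correct predictions implied by a reported percentage."""
    return n_proteins * pct_correct / 100.0


def overall_accuracy(n_tm: int, tm_pct: float, n_nm: int, nm_pct: float) -> float:
    """Overall accuracy (%) as the size-weighted mean of the class-wise accuracies."""
    correct = correctly_predicted(n_tm, tm_pct) + correctly_predicted(n_nm, nm_pct)
    return 100.0 * correct / (n_tm + n_nm)


def balanced_accuracy(tm_pct: float, nm_pct: float) -> float:
    """Balanced accuracy (%): unweighted mean of the two class-wise accuracies."""
    return (tm_pct + nm_pct) / 2.0


if __name__ == "__main__":
    # Assumption: the reported rates apply to the independent evaluation set.
    n_tm, n_nm = 3389, 6063
    tm_pct, nm_pct = 90.1, 86.7
    print(f"overall accuracy:  {overall_accuracy(n_tm, tm_pct, n_nm, nm_pct):.1f}%")
    print(f"balanced accuracy: {balanced_accuracy(tm_pct, nm_pct):.1f}%")
```

Under that assumption the overall accuracy comes out near 87.9% and the balanced accuracy at 88.4%. The overall figure sits closer to the NM rate because NM proteins outnumber TM proteins in the independent set.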