Prediction of amyloid fibril-forming segments based on a support vector machine

BACKGROUND: Amyloid fibrillar aggregates of proteins or polypeptides are known to be associated with many human diseases. Recent studies suggest that short protein regions trigger this aggregation. Thus, identifying these short peptides is critical for understanding diseases and finding potential th...

Bibliographic Details
Main authors: Tian, Jian; Wu, Ningfeng; Guo, Jun; Fan, Yunliu
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2648769/
https://www.ncbi.nlm.nih.gov/pubmed/19208147
http://dx.doi.org/10.1186/1471-2105-10-S1-S45
author Tian, Jian
Wu, Ningfeng
Guo, Jun
Fan, Yunliu
collection PubMed
description BACKGROUND: Amyloid fibrillar aggregates of proteins or polypeptides are known to be associated with many human diseases. Recent studies suggest that short protein regions trigger this aggregation. Thus, identifying these short peptides is critical for understanding diseases and finding potential therapeutic targets. RESULTS: We propose a method, named Pafig (Prediction of amyloid fibril-forming segments), based on support vector machines, to identify the hexapeptides associated with amyloid fibrillar aggregates. The features of Pafig were obtained by a two-round selection from AAindex. In a 10-fold cross-validation test on the Hexpepset dataset, Pafig achieved an overall accuracy of 81% and a Matthews correlation coefficient of 0.63. Pafig was then used to predict the potential fibril-forming hexapeptides among all 64,000,000 possible hexapeptides. Approximately 5.08% of these hexapeptides showed a high aggregation propensity. In the predicted fibril-forming hexapeptides, alanine, phenylalanine, isoleucine, leucine and valine occurred at higher frequencies, whereas aspartic acid, glutamic acid, histidine, lysine, arginine and proline appeared at lower frequencies. CONCLUSION: The performance of Pafig indicates that it is a powerful tool for identifying the hexapeptides associated with fibrillar aggregates and will be useful for large-scale analysis of proteomic data.
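The figures quoted in the description can be checked with a little arithmetic. The following is a minimal sketch, not the authors' code: it assumes the 20 standard amino acids as the alphabet (which gives the 64,000,000 hexapeptides mentioned above) and uses the standard textbook definitions of accuracy and the Matthews correlation coefficient. The function names and the example confusion-matrix counts are hypothetical and are not taken from the paper.

```python
from math import sqrt

# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_peptides(alphabet: str = AMINO_ACIDS, length: int = 6) -> int:
    """Number of distinct peptides of the given length over the alphabet."""
    return len(alphabet) ** length


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


total = count_peptides()                  # 20 ** 6
predicted_positive = round(total * 0.0508)  # 5.08% reported as aggregation-prone

# Hypothetical counts, chosen only to illustrate the two metrics.
example_acc = accuracy(tp=40, tn=40, fp=10, fn=10)
example_mcc = matthews_cc(tp=40, tn=40, fp=10, fn=10)
```

Under these assumptions, 5.08% of the hexapeptide space corresponds to roughly 3.25 million sequences. The hypothetical counts show that accuracy and the Matthews correlation coefficient measure different things: accuracy is a fraction of correct calls, while the coefficient ranges from -1 to 1 and accounts for all four cells of the confusion matrix.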
format Text
id pubmed-2648769
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2648769 2009-03-03 Prediction of amyloid fibril-forming segments based on a support vector machine. Tian, Jian; Wu, Ningfeng; Guo, Jun; Fan, Yunliu. BMC Bioinformatics. Research. BioMed Central 2009-01-30 /pmc/articles/PMC2648769/ /pubmed/19208147 http://dx.doi.org/10.1186/1471-2105-10-S1-S45 Text en Copyright © 2009 Tian et al; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Prediction of amyloid fibril-forming segments based on a support vector machine
topic Research