Protein structure search and local structure characterization

BACKGROUND: Structural similarities among proteins can provide valuable insight into their functional mechanisms and relationships. As the number of available three-dimensional (3D) protein structures increases, a greater variety of studies can be conducted with increasing efficiency, among which is...

Bibliographic Details
Main authors: Ku, Shih-Yen, Hu, Yuh-Jyh
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2529324/
https://www.ncbi.nlm.nih.gov/pubmed/18721472
http://dx.doi.org/10.1186/1471-2105-9-349
_version_ 1782158917381390336
author Ku, Shih-Yen
Hu, Yuh-Jyh
author_facet Ku, Shih-Yen
Hu, Yuh-Jyh
author_sort Ku, Shih-Yen
collection PubMed
description BACKGROUND: Structural similarities among proteins can provide valuable insight into their functional mechanisms and relationships. As the number of available three-dimensional (3D) protein structures increases, a greater variety of studies can be conducted with increasing efficiency, among which is the design of protein structural alphabets. Structural alphabets allow us to characterize local structures of proteins and describe the global folding structure of a protein using a one-dimensional (1D) sequence. Thus, 1D sequences can be used to identify structural similarities among proteins using standard sequence alignment tools such as BLAST or FASTA. RESULTS: We used self-organizing maps in combination with a minimum spanning tree algorithm to determine the optimum size of a structural alphabet and applied the k-means algorithm to group protein fragments into clusters. The centroids of these clusters defined the structural alphabet. We also developed a flexible matrix training system to build a substitution matrix (TRISUM-169) for our alphabet. Based on FASTA and using TRISUM-169 as the substitution matrix, we developed the SA-FAST alignment tool. We compared the performance of SA-FAST with that of various search tools in database-scale search tasks and found that SA-FAST was highly competitive in all tests conducted. Further, we evaluated the performance of our structural alphabet in recognizing specific structural domains of EGF and EGF-like proteins. Our method successfully recovered more EGF sub-domains using our structural alphabet than when using other structural alphabets. SA-FAST can be found at . CONCLUSION: The goal of this project was two-fold. First, we wanted to introduce a modular design pipeline to those who have been working with structural alphabets. Secondly, we wanted to open the door to researchers who have done substantial work in biological sequences but have yet to enter the field of protein structure research. Our experiments showed that by transforming the structural representations from 3D to 1D, several 1D-based tools can be applied to structural analysis, including similarity searches and structural motif finding.
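The clustering and 3D-to-1D encoding steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-number fragment descriptors, the hand-picked alphabet size `k=2`, the deterministic initialization, and the function names `kmeans` and `encode` are all assumptions made for illustration. The record does not say how fragments are represented, and the paper determines the alphabet size with self-organizing maps and a minimum spanning tree, which this sketch omits.

```python
import string


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point` (first one wins on ties)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (an assumption
    chosen so the toy example is deterministic)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def encode(fragments, centroids):
    """Map each fragment to the letter of its nearest centroid (1D string)."""
    return "".join(string.ascii_uppercase[nearest(f, centroids)] for f in fragments)


# Hypothetical descriptors for five consecutive fragments of one protein.
fragments = [(-60, -45), (-63, -41), (-120, 130), (-58, -47), (-118, 135)]
alphabet = kmeans(fragments, k=2)      # centroids play the role of letters
encoded = encode(fragments, alphabet)
print(encoded)  # prints "AABAB"
```

Once structures are written as strings like this, they can in principle be passed to sequence tools such as FASTA, which is the idea behind SA-FAST; a real alphabet would also need a trained substitution matrix such as the TRISUM-169 described above.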
format Text
id pubmed-2529324
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-25293242008-09-05 Protein structure search and local structure characterization Ku, Shih-Yen Hu, Yuh-Jyh BMC Bioinformatics Methodology Article BioMed Central 2008-08-22 /pmc/articles/PMC2529324/ /pubmed/18721472 http://dx.doi.org/10.1186/1471-2105-9-349 Text en Copyright © 2008 Ku and Hu; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Ku, Shih-Yen
Hu, Yuh-Jyh
Protein structure search and local structure characterization
title Protein structure search and local structure characterization
title_full Protein structure search and local structure characterization
title_fullStr Protein structure search and local structure characterization
title_full_unstemmed Protein structure search and local structure characterization
title_short Protein structure search and local structure characterization
title_sort protein structure search and local structure characterization
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2529324/
https://www.ncbi.nlm.nih.gov/pubmed/18721472
http://dx.doi.org/10.1186/1471-2105-9-349
work_keys_str_mv AT kushihyen proteinstructuresearchandlocalstructurecharacterization
AT huyuhjyh proteinstructuresearchandlocalstructurecharacterization