
Identification of similar regions of protein structures using integrated sequence and structure analysis tools

Bibliographic Details
Main Authors:	Peters, Brandon, Moad, Charles, Youn, Eunseog, Buffington, Kris, Heiland, Randy, Mooney, Sean
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Software
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1435900/ https://www.ncbi.nlm.nih.gov/pubmed/16526955 http://dx.doi.org/10.1186/1472-6807-6-4

author	Peters, Brandon Moad, Charles Youn, Eunseog Buffington, Kris Heiland, Randy Mooney, Sean
collection	PubMed
description	BACKGROUND: Understanding protein function from its structure is a challenging problem. Sequence-based approaches for finding homology are broadly used to annotate both structure and function. 3D structural information about protein domains and their interactions provides a view of structure-function relationships that is complementary to sequence information. We have developed a web site and an API of web services that enable users to submit protein structures and identify statistically significant neighbors, along with the underlying structural environments that produce each match, using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer-based superfamily predictions to give a unique integrated view of the prediction of SCOP superfamilies, EC numbers, and GO terms, as well as identification of the protein structural environments associated with each prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. RESULTS: Users are able to submit their own queries or use a structure already in the PDB. Currently, the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70, CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect the hits found with PSI-BLAST, HMMer and S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP IDs, Gene Ontology (GO) IDs and EC numbers. The method is efficient and fully automated, generally taking around fifteen minutes for a 400-residue protein.
CONCLUSION: With structural genomics initiatives determining structures with little, if any, functional characterization, development of protein structure and function analysis tools is a necessary endeavor. We have developed a useful application towards a solution to this problem using common structure- and sequence-based analysis tools. These approaches are able to find statistically significant environments in a database of protein structures, and the method is able to quantify how closely each environment is associated with a predicted functional annotation.
format	Text
id	pubmed-1435900
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1435900 2006-04-14 Identification of similar regions of protein structures using integrated sequence and structure analysis tools Peters, Brandon Moad, Charles Youn, Eunseog Buffington, Kris Heiland, Randy Mooney, Sean BMC Struct Biol Software BioMed Central 2006-03-09 /pmc/articles/PMC1435900/ /pubmed/16526955 http://dx.doi.org/10.1186/1472-6807-6-4 Text en Copyright © 2006 Peters et al; licensee BioMed Central Ltd.
title	Identification of similar regions of protein structures using integrated sequence and structure analysis tools
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1435900/ https://www.ncbi.nlm.nih.gov/pubmed/16526955 http://dx.doi.org/10.1186/1472-6807-6-4
