Identifying foldable regions in protein sequence from the hydrophobic signal

Structural genomics initiatives aim to elucidate representative 3D structures for the majority of protein families over the next decade, but many obstacles must be overcome. The correct design of constructs is extremely important since many proteins will be too large or contain unstructured regions...

Bibliographic Details
Main Authors: Pang, Chi N.I.; Lin, Kuang; Wouters, Merridee A.; Heringa, Jaap; George, Richard A.
Format: Text
Language: English
Published: Oxford University Press 2008
Subjects: Computational Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2241846/
https://www.ncbi.nlm.nih.gov/pubmed/18056079
http://dx.doi.org/10.1093/nar/gkm1070
author Pang, Chi N.I.
Lin, Kuang
Wouters, Merridee A.
Heringa, Jaap
George, Richard A.
collection PubMed
description Structural genomics initiatives aim to elucidate representative 3D structures for the majority of protein families over the next decade, but many obstacles must be overcome. The correct design of constructs is extremely important since many proteins will be too large or contain unstructured regions and will not be amenable to crystallization. It is therefore essential to identify regions in protein sequences that are likely to be suitable for structural study. Scooby-Domain is a fast and simple method to identify globular domains in protein sequences. Domains are compact units of protein structure and their correct delineation will aid structural elucidation through a divide-and-conquer approach. Scooby-Domain predictions are based on the observed lengths and hydrophobicities of domains from proteins with known tertiary structure. The prediction method employs an A*-search to identify sequence regions that form a globular structure and those that are unstructured. On a test set of 173 proteins with consensus CATH and SCOP domain definitions, Scooby-Domain has a sensitivity of 50% and an accuracy of 29%, which is better than current state-of-the-art methods. The method does not rely on homology searches and, therefore, can identify previously unknown domains.
format Text
id pubmed-2241846
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2241846 2008-02-21 Nucleic Acids Res Computational Biology
Oxford University Press 2008-02 2007-12-01 /pmc/articles/PMC2241846/ /pubmed/18056079 http://dx.doi.org/10.1093/nar/gkm1070 Text en © 2007 The Author(s) This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identifying foldable regions in protein sequence from the hydrophobic signal
topic Computational Biology