
Systematic analysis of short internal indels and their impact on protein folding

BACKGROUND: Protein sequence insertions/deletions (indels) can be introduced during evolution or through alternative splicing (AS). Alternative splicing is an important biological phenomenon and is considered as the major means of expanding structural and functional diversity in eukaryotes. Knowledg...


Bibliographic Details
Main authors: Kim, RyangGuk; Guo, Jun-tao
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2924343/
https://www.ncbi.nlm.nih.gov/pubmed/20684774
http://dx.doi.org/10.1186/1472-6807-10-24
author Kim, RyangGuk
Guo, Jun-tao
collection PubMed
description BACKGROUND: Protein sequence insertions/deletions (indels) can be introduced during evolution or through alternative splicing (AS). Alternative splicing is an important biological phenomenon and is considered as the major means of expanding structural and functional diversity in eukaryotes. Knowledge of the structural changes due to indels is critical to our understanding of the evolution of protein structure and function. In addition, it can help us probe the evolution of alternative splicing and the diversity of functional isoforms. However, little is known about the effects of indels, in particular those involving core secondary structures, on the folding of protein structures. The long-term goal of our study is to accurately predict the structures of protein AS isoforms. As a first step towards this goal, we performed a systematic analysis of the structural changes caused by short internal indels by mining highly homologous proteins in the Protein Data Bank (PDB).
RESULTS: We compiled a non-redundant dataset of short internal indels (2-40 amino acids) from highly homologous protein pairs and analyzed the sequence and structural features of the indels. We found that about one third of indel residues are in a disordered state and the majority of the residues are exposed to solvent, suggesting that these indels are generally located on the surface of proteins. Though naturally occurring indels are fewer than engineered ones in the dataset, there are no statistically significant differences in amino acid frequencies or secondary structure types between the "Natural" indels and "All" indels in the dataset. Structural comparisons show that all the protein pairs with short internal indels in the dataset preserve the structural folds and about 85% of protein pairs have global RMSDs (root mean square deviations) of 2 Å or less, suggesting that protein structures tend to be conserved and can tolerate short insertions and deletions. A few pairs with high RMSDs result from differences in the relative positions of the protein domains, probably due to the intrinsically dynamic nature of the proteins.
CONCLUSIONS: The analysis demonstrated that protein structures have the "plasticity" to tolerate short indels. This study can provide valuable guidance for modeling protein AS isoform structures and homologous proteins with indels by placing the indels at the right locations, since the accuracy of sequence alignments dictates model quality in homology modeling.
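To make the notion of a "short internal indel (2-40 amino acids)" concrete, the following is a minimal sketch, assuming a pairwise alignment of the two homologous chains is already available as two gapped strings of equal length. The function name `internal_indels`, the `"a"`/`"b"` labels, and the example sequences are hypothetical and are not taken from the article; this is an illustration of the idea, not the authors' pipeline. A gap run counts as internal only if it lies between the first and last alignment columns in which both sequences have a residue, so terminal overhangs are ignored.

```python
from typing import List, Tuple


def internal_indels(
    aln_a: str, aln_b: str, min_len: int = 2, max_len: int = 40
) -> List[Tuple[str, int, int]]:
    """Find internal gap runs in a pairwise alignment.

    Returns (sequence_label, start_column, length) tuples, where
    sequence_label is "a" or "b" (the sequence that carries the gap),
    start_column is 0-based, and min_len <= length <= max_len.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    # Columns where both sequences have a residue delimit the aligned core.
    paired = [
        i for i, (x, y) in enumerate(zip(aln_a, aln_b)) if x != "-" and y != "-"
    ]
    if not paired:
        return []
    first, last = paired[0], paired[-1]

    found = []
    for label, seq in (("a", aln_a), ("b", aln_b)):
        run_start = None
        # The column at `last` holds a residue in both sequences,
        # so every gap run inside the core is closed within this loop.
        for i in range(first, last + 1):
            if seq[i] == "-":
                if run_start is None:
                    run_start = i
            elif run_start is not None:
                length = i - run_start
                if min_len <= length <= max_len:
                    found.append((label, run_start, length))
                run_start = None
    return found


# Hypothetical aligned pair: terminal gaps at both ends, two internal gap runs.
example_a = "--MKTAYIAKQR---LGLIEVQ"
example_b = "GSMKTA--AKQRGGSLGLIE--"
print(internal_indels(example_a, example_b))
```

The length bounds default to the 2-40 residue range stated in the abstract; the column indices refer to the alignment, not to residue numbering in either structure.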
format Text
id pubmed-2924343
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2924343 2010-08-20 Systematic analysis of short internal indels and their impact on protein folding Kim, RyangGuk Guo, Jun-tao BMC Struct Biol Research Article BioMed Central 2010-08-04 /pmc/articles/PMC2924343/ /pubmed/20684774 http://dx.doi.org/10.1186/1472-6807-10-24 Text en Copyright ©2010 Kim and Guo; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Systematic analysis of short internal indels and their impact on protein folding
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2924343/
https://www.ncbi.nlm.nih.gov/pubmed/20684774
http://dx.doi.org/10.1186/1472-6807-10-24