
Length constraints of multi-domain proteins in metazoans

The increasing number of annotated genome sequences in public databases has made it possible to study the length distributions and domain composition of proteins at unprecedented resolution. To identify factors that influence protein length in metazoans, we performed an analysis of all domain-annota...


Bibliographic Details
Main authors: Middleton, Sarah; Song, Timothy; Nayak, Sudhir
Format: Text
Language: English
Published: Biomedical Informatics Publishing Group, 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2951704/
https://www.ncbi.nlm.nih.gov/pubmed/20975906
author Middleton, Sarah
Song, Timothy
Nayak, Sudhir
author_facet Middleton, Sarah
Song, Timothy
Nayak, Sudhir
author_sort Middleton, Sarah
collection PubMed
description The increasing number of annotated genome sequences in public databases has made it possible to study the length distributions and domain composition of proteins at unprecedented resolution. To identify factors that influence protein length in metazoans, we performed an analysis of all domain-annotated proteins from a total of 49 animal species from Ensembl (v.56) or EnsemblMetazoa (v.3). Our results indicate that protein length constraints are not fixed as a linear function of domain count and can vary based on domain content. The presence of repeating domains was associated with relaxation of the constraints that govern protein length. Conversely, for proteins with unique domains, length constraints were generally maintained with increased domain counts. It is clear that mean (and median) protein length and domain composition vary significantly between metazoans and other kingdoms; however, the connections between function, domain content, and length are unclear. We incorporated Gene Ontology (GO) annotation to identify biological processes, cellular components, or molecular functions that favor the incorporation of multi-domain proteins. Using this approach, we identified multiple GO terms that favor the incorporation of multi-domain proteins; interestingly, several of the GO terms with elevated domain counts were not restricted to a single gene family. The findings presented here represent an important step in resolving the complex relationship between protein length, function, and domain content. The comparison of the data presented in this work to data from other kingdoms is likely to reveal additional differences in the regulation of protein length.
format Text
id pubmed-2951704
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Biomedical Informatics Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-2951704 2010-10-25 Length constraints of multi-domain proteins in metazoans Middleton, Sarah Song, Timothy Nayak, Sudhir Bioinformation Hypothesis
Biomedical Informatics Publishing Group 2010-04-30 /pmc/articles/PMC2951704/ /pubmed/20975906 Text en © 2010 Biomedical Informatics Publishing Group This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
spellingShingle Hypothesis
Middleton, Sarah
Song, Timothy
Nayak, Sudhir
Length constraints of multi-domain proteins in metazoans
title Length constraints of multi-domain proteins in metazoans
title_full Length constraints of multi-domain proteins in metazoans
title_fullStr Length constraints of multi-domain proteins in metazoans
title_full_unstemmed Length constraints of multi-domain proteins in metazoans
title_short Length constraints of multi-domain proteins in metazoans
title_sort length constraints of multi-domain proteins in metazoans
topic Hypothesis
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2951704/
https://www.ncbi.nlm.nih.gov/pubmed/20975906