Cargando…

Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila

BACKGROUND: Compositionally biased (CB) regions are stretches in protein sequences made from mainly a distinct subset of amino acid residues; such regions are frequently associated with a structural role in the cell, or with protein disorder. RESULTS: We derived a procedure for the exhaustive assign...

Descripción completa

Detalles Bibliográficos
Autor principal: Harrison, Paul M
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2006
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1618407/
https://www.ncbi.nlm.nih.gov/pubmed/17032452
http://dx.doi.org/10.1186/1471-2105-7-441
_version_ 1782130519856644096
author Harrison, Paul M
author_facet Harrison, Paul M
author_sort Harrison, Paul M
collection PubMed
description BACKGROUND: Compositionally biased (CB) regions are stretches in protein sequences made from mainly a distinct subset of amino acid residues; such regions are frequently associated with a structural role in the cell, or with protein disorder. RESULTS: We derived a procedure for the exhaustive assignment and classification of CB regions, and have applied it to thirteen metazoan proteomes. Sequences are initially scanned for the lowest-probability subsequences (LPSs) for single amino-acid types; subsequently, an exhaustive search for lowest probability subsequences (LPSs) for multiple residue types is performed iteratively until convergence, to define CB region boundaries. We analysed > 40,000 CB regions with > 20 million residues; strikingly, nine single-/double- residue biases are universally abundant, and are consistently highly ranked across both vertebrates and invertebrates. To home in subpopulations of CB regions of interest in human and D. melanogaster, we analysed CB region lengths, conservation, inferred functional categories and predicted protein disorder, and filtered for coiled coils and protein structures. In particular, we found that some of the universally abundant CB regions have significant associations to transcription and nuclear localization in Human and Drosophila, and are also predicted to be moderately or highly disordered. Focussing on Q-based biased regions, we found that these regions are typically only well conserved within mammals (appearing in 60–80% of orthologs), with shorter human transcription-related CB regions being unconserved outside of mammals; they are also preferentially linked to protein domains such as the homeodomain and glucocorticoid-receptor DNA-binding domain. In general, only ~40–50% of residues in these human and Drosophila CB regions have predicted protein disorder. CONCLUSION: This data is of use for the further functional characterization of genes, and for structural genomics initiatives.
format Text
id pubmed-1618407
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-16184072006-10-20 Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila Harrison, Paul M BMC Bioinformatics Research Article BACKGROUND: Compositionally biased (CB) regions are stretches in protein sequences made from mainly a distinct subset of amino acid residues; such regions are frequently associated with a structural role in the cell, or with protein disorder. RESULTS: We derived a procedure for the exhaustive assignment and classification of CB regions, and have applied it to thirteen metazoan proteomes. Sequences are initially scanned for the lowest-probability subsequences (LPSs) for single amino-acid types; subsequently, an exhaustive search for lowest probability subsequences (LPSs) for multiple residue types is performed iteratively until convergence, to define CB region boundaries. We analysed > 40,000 CB regions with > 20 million residues; strikingly, nine single-/double- residue biases are universally abundant, and are consistently highly ranked across both vertebrates and invertebrates. To home in subpopulations of CB regions of interest in human and D. melanogaster, we analysed CB region lengths, conservation, inferred functional categories and predicted protein disorder, and filtered for coiled coils and protein structures. In particular, we found that some of the universally abundant CB regions have significant associations to transcription and nuclear localization in Human and Drosophila, and are also predicted to be moderately or highly disordered. Focussing on Q-based biased regions, we found that these regions are typically only well conserved within mammals (appearing in 60–80% of orthologs), with shorter human transcription-related CB regions being unconserved outside of mammals; they are also preferentially linked to protein domains such as the homeodomain and glucocorticoid-receptor DNA-binding domain. In general, only ~40–50% of residues in these human and Drosophila CB regions have predicted protein disorder. CONCLUSION: This data is of use for the further functional characterization of genes, and for structural genomics initiatives. BioMed Central 2006-10-10 /pmc/articles/PMC1618407/ /pubmed/17032452 http://dx.doi.org/10.1186/1471-2105-7-441 Text en Copyright © 2006 Harrison; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Harrison, Paul M
Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila
title Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila
title_full Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila
title_fullStr Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila
title_full_unstemmed Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila
title_short Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila
title_sort exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and drosophila
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1618407/
https://www.ncbi.nlm.nih.gov/pubmed/17032452
http://dx.doi.org/10.1186/1471-2105-7-441
work_keys_str_mv AT harrisonpaulm exhaustiveassignmentofcompositionalbiasrevealsuniversallyprevalentbiasedregionsanalysisoffunctionalassociationsinhumananddrosophila