Disease association and comparative genomics of compositional bias in human proteins

Background: The evolutionary rate of disordered protein regions varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of compositional bias, indicative of disorder, across phylogenies in conjunction with human disease. In this s...

Bibliographic Details
Main Authors: Kouros, Christos E., Makri, Vasiliki, Ouzounis, Christos A., Chasapi, Anastasia
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10111144/
https://www.ncbi.nlm.nih.gov/pubmed/37082000
http://dx.doi.org/10.12688/f1000research.129929.2
_version_ 1785027399435419648
author Kouros, Christos E.
Makri, Vasiliki
Ouzounis, Christos A.
Chasapi, Anastasia
author_facet Kouros, Christos E.
Makri, Vasiliki
Ouzounis, Christos A.
Chasapi, Anastasia
author_sort Kouros, Christos E.
collection PubMed
description Background: The evolutionary rate of disordered protein regions varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of compositional bias, indicative of disorder, across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The UniProt Reference Proteome dataset, containing 11297 proteomes, was used as the target dataset for the comparative genomics of a well-defined subset of the human genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease association in the human genome reveals a significant bias towards biased regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated proteins with compositional bias across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of compositional bias, disease association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.
format Online
Article
Text
id pubmed-10111144
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-101111442023-04-19 Disease association and comparative genomics of compositional bias in human proteins Kouros, Christos E. Makri, Vasiliki Ouzounis, Christos A. Chasapi, Anastasia F1000Res Research Article Background: The evolutionary rate of disordered protein regions varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of compositional bias, indicative of disorder, across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The UniProt Reference Proteome dataset, containing 11297 proteomes, was used as the target dataset for the comparative genomics of a well-defined subset of the human genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease association in the human genome reveals a significant bias towards biased regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated proteins with compositional bias across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of compositional bias, disease association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.
F1000 Research Limited 2023-04-14 /pmc/articles/PMC10111144/ /pubmed/37082000 http://dx.doi.org/10.12688/f1000research.129929.2 Text en Copyright: © 2023 Kouros CE et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Kouros, Christos E.
Makri, Vasiliki
Ouzounis, Christos A.
Chasapi, Anastasia
Disease association and comparative genomics of compositional bias in human proteins
title Disease association and comparative genomics of compositional bias in human proteins
title_full Disease association and comparative genomics of compositional bias in human proteins
title_fullStr Disease association and comparative genomics of compositional bias in human proteins
title_full_unstemmed Disease association and comparative genomics of compositional bias in human proteins
title_short Disease association and comparative genomics of compositional bias in human proteins
title_sort disease association and comparative genomics of compositional bias in human proteins
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10111144/
https://www.ncbi.nlm.nih.gov/pubmed/37082000
http://dx.doi.org/10.12688/f1000research.129929.2
work_keys_str_mv AT kouroschristose diseaseassociationandcomparativegenomicsofcompositionalbiasinhumanproteins
AT makrivasiliki diseaseassociationandcomparativegenomicsofcompositionalbiasinhumanproteins
AT ouzounischristosa diseaseassociationandcomparativegenomicsofcompositionalbiasinhumanproteins
AT chasapianastasia diseaseassociationandcomparativegenomicsofcompositionalbiasinhumanproteins