
Disentangling the complexity of low complexity proteins

There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this criti...


Bibliographic Details
Main Authors: Mier, Pablo; Paladin, Lisanna; Tamana, Stella; Petrosian, Sophia; Hajdu-Soltész, Borbála; Urbanek, Annika; Gruca, Aleksandra; Plewczynski, Dariusz; Grynberg, Marcin; Bernadó, Pau; Gáspári, Zoltán; Ouzounis, Christos A; Promponas, Vasilis J; Kajava, Andrey V; Hancock, John M; Tosatto, Silvio C E; Dosztanyi, Zsuzsanna; Andrade-Navarro, Miguel A
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7299295/
https://www.ncbi.nlm.nih.gov/pubmed/30698641
http://dx.doi.org/10.1093/bib/bbz007
author Mier, Pablo
Paladin, Lisanna
Tamana, Stella
Petrosian, Sophia
Hajdu-Soltész, Borbála
Urbanek, Annika
Gruca, Aleksandra
Plewczynski, Dariusz
Grynberg, Marcin
Bernadó, Pau
Gáspári, Zoltán
Ouzounis, Christos A
Promponas, Vasilis J
Kajava, Andrey V
Hancock, John M
Tosatto, Silvio C E
Dosztanyi, Zsuzsanna
Andrade-Navarro, Miguel A
author_sort Mier, Pablo
collection PubMed
description There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined usage of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable the improvement of predictions and a better understanding of the evolution and the connection between structure and function of LCRs. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs. SHORT ABSTRACT: There are multiple definitions for low complexity regions (LCRs) in protein sequences. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, plus overlaps between different properties related to LCRs, using examples.
format Online
Article
Text
id pubmed-7299295
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7299295 2020-06-22 Disentangling the complexity of low complexity proteins Brief Bioinform Review Article Oxford University Press 2019-01-30 /pmc/articles/PMC7299295/ /pubmed/30698641 http://dx.doi.org/10.1093/bib/bbz007 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Disentangling the complexity of low complexity proteins
topic Review Article