Beyond Microsatellite Instability: Intrinsic Disorder as a Potential Link Between Protein Short Tandem Repeats and Cancer

Short tandem repeats (STRs) are abundant in genomic sequences and are known for comparatively high mutation rates; STRs therefore are thought to be a potent source of genetic diversity. In protein-coding sequences STRs primarily encode disorder-promoting amino acids and are often located in intrinsi...

Bibliographic Details
Main Authors: Verbiest, Max A., Delucchi, Matteo, Bilgin Sonay, Tugce, Anisimova, Maria
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9581044/
https://www.ncbi.nlm.nih.gov/pubmed/36303757
http://dx.doi.org/10.3389/fbinf.2021.685844
_version_ 1784812529650761728
author Verbiest, Max A.
Delucchi, Matteo
Bilgin Sonay, Tugce
Anisimova, Maria
author_facet Verbiest, Max A.
Delucchi, Matteo
Bilgin Sonay, Tugce
Anisimova, Maria
author_sort Verbiest, Max A.
collection PubMed
description Short tandem repeats (STRs) are abundant in genomic sequences and are known for comparatively high mutation rates; STRs therefore are thought to be a potent source of genetic diversity. In protein-coding sequences STRs primarily encode disorder-promoting amino acids and are often located in intrinsically disordered regions (IDRs). STRs are frequently studied in the scope of microsatellite instability (MSI) in cancer, with little focus on the connection between protein STRs and IDRs. We believe, however, that this relationship should be explicitly included when ascertaining STR functionality in cancer. Here we explore this notion using all canonical human proteins from SwissProt, wherein we detected 3,699 STRs. Over 80% of these consisted completely of disorder promoting amino acids. 62.1% of amino acids in STR sequences were predicted to also be in an IDR, compared to 14.2% for non-repeat sequences. Over-representation analysis showed STR-containing proteins to be primarily located in the nucleus where they perform protein- and nucleotide-binding functions and regulate gene expression. They were also enriched in cancer-related signaling pathways. Furthermore, we found enrichments of STR-containing proteins among those correlated with patient survival for cancers derived from eight different anatomical sites. Intriguingly, several of these cancer types are not known to have a MSI-high (MSI-H) phenotype, suggesting that protein STRs play a role in cancer pathology in non MSI-H settings. Their intrinsic link with IDRs could therefore be an attractive topic of future research to further explore the role of STRs and IDRs in cancer. We speculate that our observations may be linked to the known dosage-sensitivity of disordered proteins, which could hint at a concentration-dependent gain-of-function mechanism in cancer for proteins containing STRs and IDRs.
format Online
Article
Text
id pubmed-9581044
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-95810442022-10-26 Beyond Microsatellite Instability: Intrinsic Disorder as a Potential Link Between Protein Short Tandem Repeats and Cancer Verbiest, Max A. Delucchi, Matteo Bilgin Sonay, Tugce Anisimova, Maria Front Bioinform Bioinformatics Short tandem repeats (STRs) are abundant in genomic sequences and are known for comparatively high mutation rates; STRs therefore are thought to be a potent source of genetic diversity. In protein-coding sequences STRs primarily encode disorder-promoting amino acids and are often located in intrinsically disordered regions (IDRs). STRs are frequently studied in the scope of microsatellite instability (MSI) in cancer, with little focus on the connection between protein STRs and IDRs. We believe, however, that this relationship should be explicitly included when ascertaining STR functionality in cancer. Here we explore this notion using all canonical human proteins from SwissProt, wherein we detected 3,699 STRs. Over 80% of these consisted completely of disorder promoting amino acids. 62.1% of amino acids in STR sequences were predicted to also be in an IDR, compared to 14.2% for non-repeat sequences. Over-representation analysis showed STR-containing proteins to be primarily located in the nucleus where they perform protein- and nucleotide-binding functions and regulate gene expression. They were also enriched in cancer-related signaling pathways. Furthermore, we found enrichments of STR-containing proteins among those correlated with patient survival for cancers derived from eight different anatomical sites. Intriguingly, several of these cancer types are not known to have a MSI-high (MSI-H) phenotype, suggesting that protein STRs play a role in cancer pathology in non MSI-H settings. Their intrinsic link with IDRs could therefore be an attractive topic of future research to further explore the role of STRs and IDRs in cancer. 
We speculate that our observations may be linked to the known dosage-sensitivity of disordered proteins, which could hint at a concentration-dependent gain-of-function mechanism in cancer for proteins containing STRs and IDRs. Frontiers Media S.A. 2021-06-08 /pmc/articles/PMC9581044/ /pubmed/36303757 http://dx.doi.org/10.3389/fbinf.2021.685844 Text en Copyright © 2021 Verbiest, Delucchi, Bilgin Sonay and Anisimova. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Bioinformatics
Verbiest, Max A.
Delucchi, Matteo
Bilgin Sonay, Tugce
Anisimova, Maria
Beyond Microsatellite Instability: Intrinsic Disorder as a Potential Link Between Protein Short Tandem Repeats and Cancer
title Beyond Microsatellite Instability: Intrinsic Disorder as a Potential Link Between Protein Short Tandem Repeats and Cancer
title_full Beyond Microsatellite Instability: Intrinsic Disorder as a Potential Link Between Protein Short Tandem Repeats and Cancer
title_fullStr Beyond Microsatellite Instability: Intrinsic Disorder as a Potential Link Between Protein Short Tandem Repeats and Cancer
title_full_unstemmed Beyond Microsatellite Instability: Intrinsic Disorder as a Potential Link Between Protein Short Tandem Repeats and Cancer
title_short Beyond Microsatellite Instability: Intrinsic Disorder as a Potential Link Between Protein Short Tandem Repeats and Cancer
title_sort beyond microsatellite instability: intrinsic disorder as a potential link between protein short tandem repeats and cancer
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9581044/
https://www.ncbi.nlm.nih.gov/pubmed/36303757
http://dx.doi.org/10.3389/fbinf.2021.685844