Cargando…

ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides

An RNA G-quadruplex in the protein coding segment of mRNA is translatable [Formula: see text] and may potentially impact protein translation. This can be consequent to staggered ribosomal synthesis and/or result in an increased frequency of missense translational events. A mathematical model of the...

Descripción completa

Detalles Bibliográficos
Autor principal: Kundu, Siddhartha
Formato: Online Artículo Texto
Lenguaje:English
Publicado: SAGE Publications 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8482721/
https://www.ncbi.nlm.nih.gov/pubmed/34602814
http://dx.doi.org/10.1177/11779322211045878
_version_ 1784576969049899008
author Kundu, Siddhartha
author_facet Kundu, Siddhartha
author_sort Kundu, Siddhartha
collection PubMed
description An RNA G-quadruplex in the protein coding segment of mRNA is translatable [Formula: see text] and may potentially impact protein translation. This can be consequent to staggered ribosomal synthesis and/or result in an increased frequency of missense translational events. A mathematical model of the peptides that encompass the substituted amino acids, ie, the [Formula: see text]-mapped peptidome, has been previously studied. However, the significance and relevance to disease biology of this model remains to be established. ProTG4 computes a confidence-of-sequence-identity [Formula: see text]-score, which is the average weighted length of every matched [Formula: see text]-mapped peptide in a generic protein sequence. The weighted length is the product of the length of the peptide and the probability of its non-random occurrence in a library of randomly generated sequences of equivalent lengths. This is then averaged over the entire length of the protein sequence. ProTG4 is simple to operate, has clear instructions, and is accompanied by a set of ready-to-use examples. The rationale of the study, algorithms deployed, and the computational pipeline deployed are also part of the web page. Analyses by ProTG4 of taxonomically diverse protein sequences suggest that there is significant homology to [Formula: see text]-mapped peptides. These findings, especially in potentially infectious and infesting agents, offer plausible explanations into the aetiology and pathogenesis of certain proteopathies. ProTG4 can also provide a quantitative measure to identify and annotate the canonical form of a generic protein sequence from its known isoforms. The article presents several case studies and discusses the relevance of ProTG4-assisted peptide analysis in gaining insights into various mechanisms of disease biology (mistranslation, alternate splicing, amino acid substitutions).
format Online
Article
Text
id pubmed-8482721
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-84827212021-10-01 ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides Kundu, Siddhartha Bioinform Biol Insights Original Research An RNA G-quadruplex in the protein coding segment of mRNA is translatable [Formula: see text] and may potentially impact protein translation. This can be consequent to staggered ribosomal synthesis and/or result in an increased frequency of missense translational events. A mathematical model of the peptides that encompass the substituted amino acids, ie, the [Formula: see text]-mapped peptidome, has been previously studied. However, the significance and relevance to disease biology of this model remains to be established. ProTG4 computes a confidence-of-sequence-identity [Formula: see text]-score, which is the average weighted length of every matched [Formula: see text]-mapped peptide in a generic protein sequence. The weighted length is the product of the length of the peptide and the probability of its non-random occurrence in a library of randomly generated sequences of equivalent lengths. This is then averaged over the entire length of the protein sequence. ProTG4 is simple to operate, has clear instructions, and is accompanied by a set of ready-to-use examples. The rationale of the study, algorithms deployed, and the computational pipeline deployed are also part of the web page. Analyses by ProTG4 of taxonomically diverse protein sequences suggest that there is significant homology to [Formula: see text]-mapped peptides. These findings, especially in potentially infectious and infesting agents, offer plausible explanations into the aetiology and pathogenesis of certain proteopathies. ProTG4 can also provide a quantitative measure to identify and annotate the canonical form of a generic protein sequence from its known isoforms. The article presents several case studies and discusses the relevance of ProTG4-assisted peptide analysis in gaining insights into various mechanisms of disease biology (mistranslation, alternate splicing, amino acid substitutions). SAGE Publications 2021-09-28 /pmc/articles/PMC8482721/ /pubmed/34602814 http://dx.doi.org/10.1177/11779322211045878 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by-nc/4.0/This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Original Research
Kundu, Siddhartha
ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides
title ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides
title_full ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides
title_fullStr ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides
title_full_unstemmed ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides
title_short ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides
title_sort protg4: a web server to approximate the sequence of a generic protein from an in silico library of translatable g-quadruplex (tg4)-mapped peptides
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8482721/
https://www.ncbi.nlm.nih.gov/pubmed/34602814
http://dx.doi.org/10.1177/11779322211045878
work_keys_str_mv AT kundusiddhartha protg4awebservertoapproximatethesequenceofagenericproteinfromaninsilicolibraryoftranslatablegquadruplextg4mappedpeptides