
PolyX2: Fast Detection of Homorepeats in Large Protein Datasets

Homorepeat sequences, consecutive runs of identical amino acids, are prevalent in eukaryotic proteins. It has become necessary to annotate and evaluate this feature in entire proteomes. The definition of what constitutes a homorepeat is not fixed, and different research approaches may require differ...


Bibliographic Details
Main authors: Mier, Pablo; Andrade-Navarro, Miguel A.
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9141109/
https://www.ncbi.nlm.nih.gov/pubmed/35627143
http://dx.doi.org/10.3390/genes13050758
_version_ 1784715263875219456
author Mier, Pablo
Andrade-Navarro, Miguel A.
author_facet Mier, Pablo
Andrade-Navarro, Miguel A.
author_sort Mier, Pablo
collection PubMed
description Homorepeat sequences, consecutive runs of identical amino acids, are prevalent in eukaryotic proteins. It has become necessary to annotate and evaluate this feature in entire proteomes. The definition of what constitutes a homorepeat is not fixed, and different research approaches may require different definitions; therefore, flexible approaches to analyze homorepeats in complete proteomes are needed. Here, we present polyX2, a fast, simple but tunable script to scan protein datasets for all possible homorepeats. The user can modify the length of the window to scan, the minimum number of identical residues that must be found in the window, and the types of homorepeats to be found.
format Online
Article
Text
id pubmed-9141109
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-91411092022-05-28 PolyX2: Fast Detection of Homorepeats in Large Protein Datasets Mier, Pablo Andrade-Navarro, Miguel A. Genes (Basel) Brief Report Homorepeat sequences, consecutive runs of identical amino acids, are prevalent in eukaryotic proteins. It has become necessary to annotate and evaluate this feature in entire proteomes. The definition of what constitutes a homorepeat is not fixed, and different research approaches may require different definitions; therefore, flexible approaches to analyze homorepeats in complete proteomes are needed. Here, we present polyX2, a fast, simple but tunable script to scan protein datasets for all possible homorepeats. The user can modify the length of the window to scan, the minimum number of identical residues that must be found in the window, and the types of homorepeats to be found. MDPI 2022-04-25 /pmc/articles/PMC9141109/ /pubmed/35627143 http://dx.doi.org/10.3390/genes13050758 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Brief Report
Mier, Pablo
Andrade-Navarro, Miguel A.
PolyX2: Fast Detection of Homorepeats in Large Protein Datasets
title PolyX2: Fast Detection of Homorepeats in Large Protein Datasets
title_full PolyX2: Fast Detection of Homorepeats in Large Protein Datasets
title_fullStr PolyX2: Fast Detection of Homorepeats in Large Protein Datasets
title_full_unstemmed PolyX2: Fast Detection of Homorepeats in Large Protein Datasets
title_short PolyX2: Fast Detection of Homorepeats in Large Protein Datasets
title_sort polyx2: fast detection of homorepeats in large protein datasets
topic Brief Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9141109/
https://www.ncbi.nlm.nih.gov/pubmed/35627143
http://dx.doi.org/10.3390/genes13050758
work_keys_str_mv AT mierpablo polyx2fastdetectionofhomorepeatsinlargeproteindatasets
AT andradenavarromiguela polyx2fastdetectionofhomorepeatsinlargeproteindatasets
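
The description above names three tunable parameters: the length of the window to scan, the minimum number of identical residues required within that window, and the types of homorepeats to look for. The following is a minimal Python sketch of such a sliding-window scan, not the authors' polyX2 script. The function name `find_homorepeats`, its parameter names, the default values, and the choice to merge overlapping hit windows into one region (without trimming flanking residues) are all assumptions made for illustration.

```python
from typing import Iterable, List, Optional, Tuple


def find_homorepeats(
    sequence: str,
    window: int = 10,          # illustrative default, not taken from the record
    min_identical: int = 8,    # illustrative default, not taken from the record
    residues: Optional[Iterable[str]] = None,
) -> List[Tuple[str, int, int]]:
    """Hypothetical helper: return (residue, start, end) regions.

    Coordinates are 1-based and inclusive. A window is a hit when it holds
    at least `min_identical` copies of one residue; overlapping or adjacent
    hit windows for the same residue are merged into a single region.
    """
    if not 0 < min_identical <= window:
        raise ValueError("require 0 < min_identical <= window")

    seq = sequence.upper()
    # Scan only the requested residue types, or every type present.
    targets = sorted(set(residues)) if residues is not None else sorted(set(seq))

    hits: List[Tuple[str, int, int]] = []
    for aa in targets:
        region_start = None  # 0-based start of the region being built
        region_end = None    # 0-based exclusive end of that region
        count = seq[:window].count(aa)  # residue count in the first window
        for start in range(0, len(seq) - window + 1):
            if start > 0:
                # Slide by one: drop the residue that left, add the one that entered.
                count -= seq[start - 1] == aa
                count += seq[start + window - 1] == aa
            if count >= min_identical:
                end = start + window
                if region_start is not None and start <= region_end:
                    region_end = end  # extend the current region
                else:
                    if region_start is not None:
                        hits.append((aa, region_start + 1, region_end))
                    region_start, region_end = start, end
        if region_start is not None:
            hits.append((aa, region_start + 1, region_end))

    # Report regions in sequence order.
    return sorted(hits, key=lambda h: (h[1], h[0]))


if __name__ == "__main__":
    example = "MKT" + "Q" * 10 + "LLA"
    print(find_homorepeats(example, window=10, min_identical=8, residues="Q"))
```

Because every window with enough identical residues counts as a hit, a reported region in this sketch can include a few non-matching residues on either side of the pure run. A stricter definition of a homorepeat would only need different `window` and `min_identical` values, for example setting both to the same number to require an uninterrupted run.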