RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens

BACKGROUND: Amino acid repeat-containing proteins have a broad range of functions and their identification is of relevance to many experimental biologists. In human-infective protozoan parasites (such as the Kinetoplastid and Plasmodium species), they are implicated in immune evasion and have been s...

Bibliographic Details
Main authors:	Depledge, Daniel P; Lower, Ryan PJ; Smith, Deborah F
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Database
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1854910/ https://www.ncbi.nlm.nih.gov/pubmed/17428323 http://dx.doi.org/10.1186/1471-2105-8-122

_version_	1782133125602607104
author	Depledge, Daniel P Lower, Ryan PJ Smith, Deborah F
author_facet	Depledge, Daniel P Lower, Ryan PJ Smith, Deborah F
author_sort	Depledge, Daniel P
collection	PubMed
description	BACKGROUND: Amino acid repeat-containing proteins have a broad range of functions and their identification is of relevance to many experimental biologists. In human-infective protozoan parasites (such as the Kinetoplastid and Plasmodium species), they are implicated in immune evasion and have been shown to influence virulence and pathogenicity. RepSeq is a new database of amino acid repeat-containing proteins found in lower eukaryotic pathogens. The RepSeq database is accessed via a web-based application which also provides links to related online tools and databases for further analyses. RESULTS: The RepSeq algorithm typically identifies more than 98% of repeat-containing proteins and is capable of identifying both perfect and mismatch repeats. The proportion of proteins that contain repeat elements varies greatly between different families and even species (3–35% of the total protein content). The most common motif type is the Sequence Repeat Region (SRR) – a repeated motif containing multiple different amino acid types. Proteins containing Single Amino Acid Repeats (SAARs) and Di-Peptide Repeats (DPRs) typically account for 0.5–1.0% of the total protein number. Notable exceptions are P. falciparum and D. discoideum, in which 33.67% and 34.28% respectively of the predicted proteomes consist of repeat-containing proteins. These numbers are due to large insertions of low complexity single and multi-codon repeat regions. CONCLUSION: The RepSeq database provides a repository for repeat-containing proteins found in parasitic protozoa. The database allows for both individual and cross-species proteome analyses and also allows users to upload sequences of interest for analysis by the RepSeq algorithm. 
Identification of repeat-containing proteins provides researchers with a defined subset of proteins which can be analysed by expression profiling and functional characterisation, thereby facilitating study of pathogenicity and virulence factors in the parasitic protozoa. While primarily designed for kinetoplastid work, the RepSeq algorithm and database retain full functionality when used to analyse other species.
format	Text
id	pubmed-1854910
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-18549102007-04-21 RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens Depledge, Daniel P Lower, Ryan PJ Smith, Deborah F BMC Bioinformatics Database BACKGROUND: Amino acid repeat-containing proteins have a broad range of functions and their identification is of relevance to many experimental biologists. In human-infective protozoan parasites (such as the Kinetoplastid and Plasmodium species), they are implicated in immune evasion and have been shown to influence virulence and pathogenicity. RepSeq is a new database of amino acid repeat-containing proteins found in lower eukaryotic pathogens. The RepSeq database is accessed via a web-based application which also provides links to related online tools and databases for further analyses. RESULTS: The RepSeq algorithm typically identifies more than 98% of repeat-containing proteins and is capable of identifying both perfect and mismatch repeats. The proportion of proteins that contain repeat elements varies greatly between different families and even species (3–35% of the total protein content). The most common motif type is the Sequence Repeat Region (SRR) – a repeated motif containing multiple different amino acid types. Proteins containing Single Amino Acid Repeats (SAARs) and Di-Peptide Repeats (DPRs) typically account for 0.5–1.0% of the total protein number. Notable exceptions are P. falciparum and D. discoideum, in which 33.67% and 34.28% respectively of the predicted proteomes consist of repeat-containing proteins. These numbers are due to large insertions of low complexity single and multi-codon repeat regions. CONCLUSION: The RepSeq database provides a repository for repeat-containing proteins found in parasitic protozoa. The database allows for both individual and cross-species proteome analyses and also allows users to upload sequences of interest for analysis by the RepSeq algorithm. 
Identification of repeat-containing proteins provides researchers with a defined subset of proteins which can be analysed by expression profiling and functional characterisation, thereby facilitating study of pathogenicity and virulence factors in the parasitic protozoa. While primarily designed for kinetoplastid work, the RepSeq algorithm and database retain full functionality when used to analyse other species. BioMed Central 2007-04-11 /pmc/articles/PMC1854910/ /pubmed/17428323 http://dx.doi.org/10.1186/1471-2105-8-122 Text en Copyright © 2007 Depledge et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Database Depledge, Daniel P Lower, Ryan PJ Smith, Deborah F RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens
title	RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens
title_full	RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens
title_fullStr	RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens
title_full_unstemmed	RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens
title_short	RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens
title_sort	repseq – a database of amino acid repeats present in lower eukaryotic pathogens
topic	Database
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1854910/ https://www.ncbi.nlm.nih.gov/pubmed/17428323 http://dx.doi.org/10.1186/1471-2105-8-122
work_keys_str_mv	AT depledgedanielp repseqadatabaseofaminoacidrepeatspresentinlowereukaryoticpathogens AT lowerryanpj repseqadatabaseofaminoacidrepeatspresentinlowereukaryoticpathogens AT smithdeborahf repseqadatabaseofaminoacidrepeatspresentinlowereukaryoticpathogens
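
The description field above names two of the simpler repeat classes, Single Amino Acid Repeats (SAARs) and Di-Peptide Repeats (DPRs). As a rough illustration of what detecting such repeats can look like, the following is a minimal Python sketch built on regular expressions. It is not the RepSeq algorithm: the function names and the minimum-length thresholds are assumptions made for this example (the record does not state RepSeq's cut-offs), and the sketch finds perfect repeats only, whereas the abstract says RepSeq also handles mismatch repeats.

```python
import re

# Hypothetical thresholds chosen for illustration only;
# the record does not give the cut-offs RepSeq actually uses.
MIN_SAAR_LEN = 5   # minimum run length of a single residue
MIN_DPR_UNITS = 3  # minimum number of copies of a two-residue unit


def find_saars(seq, min_len=MIN_SAAR_LEN):
    """Return (start, end, residue) for each run of one residue
    that is at least min_len long (perfect repeats only)."""
    # One residue followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq)]


def find_dprs(seq, min_units=MIN_DPR_UNITS):
    """Return (start, end, unit) for each perfect tandem repeat of a
    two-residue unit whose two residues differ."""
    # The negative lookahead (?!\2) rejects units such as "QQ",
    # which would otherwise be reported as dipeptide repeats.
    pattern = re.compile(r"(([A-Z])(?!\2)[A-Z])\1{%d,}" % (min_units - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Made-up test sequence, not taken from any proteome.
    example = "MKQQQQQQAPNANANANAVL"
    print(find_saars(example))
    print(find_dprs(example))
```

Coordinates are zero-based with an exclusive end, following Python slicing, so `seq[start:end]` returns the repeat region. Tolerating mismatches would need a different approach, for example scoring windows against a candidate unit, which this sketch does not attempt.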
