
PathFams: statistical detection of pathogen-associated protein domains


Bibliographic Details
Main authors: Lobb, Briallen; Tremblay, Benjamin Jean-Marie; Moreno-Hagelsieb, Gabriel; Doxey, Andrew C.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8442362/
https://www.ncbi.nlm.nih.gov/pubmed/34521345
http://dx.doi.org/10.1186/s12864-021-07982-8
Description
Summary: BACKGROUND: A substantial fraction of genes identified within bacterial genomes encode proteins of unknown function. Identifying which of these proteins represent potential virulence factors, and mapping their key virulence determinants, is a challenging but important goal. RESULTS: To facilitate virulence factor discovery, we performed a comprehensive analysis of 17,929 protein domain families within the Pfam database, and scored them based on their overrepresentation in pathogenic versus non-pathogenic species, taxonomic distribution, relative abundance in metagenomic datasets, and other factors. CONCLUSIONS: We identify pathogen-associated domain families, candidate virulence factors in the human gut, and eukaryotic-like mimicry domains with likely roles in virulence. Furthermore, we provide an interactive database called PathFams to allow users to explore pathogen-associated domains as well as identify pathogen-associated domains and domain architectures in user-uploaded sequences of interest. PathFams is freely available at https://pathfams.uwaterloo.ca. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07982-8.