Profile-based short linear protein motif discovery

BACKGROUND: Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3–10 amino acids in length that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of pr...

Bibliographic Details
Main Authors: Haslam, Niall J, Shields, Denis C
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3534220/
https://www.ncbi.nlm.nih.gov/pubmed/22607209
http://dx.doi.org/10.1186/1471-2105-13-104
_version_ 1782475294606622720
author Haslam, Niall J
Shields, Denis C
author_facet Haslam, Niall J
Shields, Denis C
author_sort Haslam, Niall J
collection PubMed
description BACKGROUND: Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3–10 amino acids in length that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation. RESULTS: The profile motif discovery method MEME performed relatively poorly for motifs in disordered regions of proteins. However, when we applied evolutionary weighting to account for redundancy amongst homologous proteins, and masked out poorly conserved regions of disordered proteins, the performance of MEME was equivalent to that of regular expression methods. However, the two approaches returned different subsets within both a benchmark dataset and a more realistic discovery dataset. CONCLUSIONS: Profile-based motif discovery methods complement regular-expression-based methods. Whilst profile-based methods are computationally more intensive, they are likely to discover motifs currently overlooked by regular expression methods.
format Online
Article
Text
id pubmed-3534220
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3534220 2013-01-07 Profile-based short linear protein motif discovery Haslam, Niall J Shields, Denis C BMC Bioinformatics Methodology Article BACKGROUND: Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3–10 amino acids in length that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation. RESULTS: The profile motif discovery method MEME performed relatively poorly for motifs in disordered regions of proteins. However, when we applied evolutionary weighting to account for redundancy amongst homologous proteins, and masked out poorly conserved regions of disordered proteins, the performance of MEME was equivalent to that of regular expression methods. However, the two approaches returned different subsets within both a benchmark dataset and a more realistic discovery dataset. CONCLUSIONS: Profile-based motif discovery methods complement regular-expression-based methods. Whilst profile-based methods are computationally more intensive, they are likely to discover motifs currently overlooked by regular expression methods. BioMed Central 2012-05-18 /pmc/articles/PMC3534220/ /pubmed/22607209 http://dx.doi.org/10.1186/1471-2105-13-104 Text en Copyright ©2012 Haslam and Shields; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Haslam, Niall J
Shields, Denis C
Profile-based short linear protein motif discovery
title Profile-based short linear protein motif discovery
title_full Profile-based short linear protein motif discovery
title_fullStr Profile-based short linear protein motif discovery
title_full_unstemmed Profile-based short linear protein motif discovery
title_short Profile-based short linear protein motif discovery
title_sort profile-based short linear protein motif discovery
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3534220/
https://www.ncbi.nlm.nih.gov/pubmed/22607209
http://dx.doi.org/10.1186/1471-2105-13-104
work_keys_str_mv AT haslamniallj profilebasedshortlinearproteinmotifdiscovery
AT shieldsdenisc profilebasedshortlinearproteinmotifdiscovery