
A tree-based conservation scoring method for short linear motifs in multiple alignments of protein sequences


Bibliographic Details
Main authors: Chica, Claudia; Labarga, Alberto; Gould, Cathryn M; López, Rodrigo; Gibson, Toby J
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2396637/
https://www.ncbi.nlm.nih.gov/pubmed/18460207
http://dx.doi.org/10.1186/1471-2105-9-229
author Chica, Claudia
Labarga, Alberto
Gould, Cathryn M
López, Rodrigo
Gibson, Toby J
collection PubMed
description BACKGROUND: The structure of many eukaryotic cell regulatory proteins is highly modular. They are assembled from globular domains, segments of natively disordered polypeptides and short linear motifs. The latter are involved in protein interactions and formation of regulatory complexes. The function of such proteins, which may be difficult to define, is the aggregate of the subfunctions of the modules. It is therefore desirable to efficiently predict linear motifs with some degree of accuracy, yet sequence database searches return results that are not significant. RESULTS: We have developed a method for scoring the conservation of linear motif instances. It requires only primary sequence-derived information (e.g. multiple alignment and sequence tree) and takes into account the degenerate nature of linear motif patterns. On our benchmarking, the method accurately scores 86% of the known positive instances, while distinguishing them from random matches in 78% of the cases. The conservation score is implemented as a real time application designed to be integrated into other tools. It is currently accessible via a Web Service or through a graphical interface. CONCLUSION: The conservation score improves the prediction of linear motifs, by discarding those matches that are unlikely to be functional because they have not been conserved during the evolution of the protein sequences. It is especially useful for instances in non-structured regions of the proteins, where a domain masking filtering strategy is not applicable.
format Text
id pubmed-2396637
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2396637 2008-05-28 BMC Bioinformatics Methodology Article BioMed Central 2008-05-06 /pmc/articles/PMC2396637/ /pubmed/18460207 http://dx.doi.org/10.1186/1471-2105-9-229 Text en Copyright © 2008 Chica et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A tree-based conservation scoring method for short linear motifs in multiple alignments of protein sequences
topic Methodology Article