The 3of5 web application for complex and comprehensive pattern matching in protein sequences

Bibliographic Details
Main Authors: Seiler, Markus; Mehrle, Alexander; Poustka, Annemarie; Wiemann, Stefan
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1523217/
https://www.ncbi.nlm.nih.gov/pubmed/16542452
http://dx.doi.org/10.1186/1471-2105-7-144
author Seiler, Markus
Mehrle, Alexander
Poustka, Annemarie
Wiemann, Stefan
collection PubMed
description BACKGROUND: The identification of patterns in biological sequences is a key challenge in genome analysis and in proteomics. Such patterns are often complex and highly variable, especially in protein sequences. They are frequently described in terms of regular expressions (RegEx) because of the user-friendly terminology. Limitations arise as the complexity of the queried patterns increases, and they are accompanied by requirements for enhanced capabilities. This is especially true for patterns containing ambiguous characters and positions and/or length ambiguities. RESULTS: We have implemented the 3of5 web application in order to enable complex pattern matching in protein sequences. 3of5 is named after a special use of its main feature, the novel n-of-m pattern type. This feature allows for an extensive specification of variable patterns where the individual elements may vary in their position, order, and content within a defined stretch of sequence. The number of distinct elements can be constrained by operators, and individual characters may be excluded. The n-of-m pattern type can be combined with common regular expression terms and thus also allows for a comprehensive description of complex patterns. 3of5 increases the fidelity of pattern matching and finds ALL possible solutions in protein sequences in cases of length-ambiguous patterns instead of simply reporting the longest or shortest hits. Grouping and combined search for patterns provide a hierarchical arrangement of larger pattern sets. The algorithm is implemented as an internet application and is freely accessible. The application is available at . CONCLUSION: The 3of5 application offers an extended vocabulary for the definition of search patterns and thus allows the user to comprehensively specify and identify peptide patterns with variable elements.
The n-of-m pattern type offers improved accuracy for pattern matching in combination with the ability to find all solutions, without compromising the user friendliness of regular expression terms.
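The two ideas named in the abstract, an n-of-m constraint and the reporting of all solutions for length-ambiguous patterns, can be illustrated with a minimal Python sketch. This is not the 3of5 implementation or its pattern syntax, neither of which is given in this record. The function names `n_of_m_hits` and `all_matches`, the "at least n" reading of the constraint, and the example sequences are all assumptions made for illustration.

```python
import re


def n_of_m_hits(sequence, allowed, n, m, excluded=""):
    """Hypothetical helper: return (start, window) for every window of
    length m in which at least n residues belong to `allowed` and no
    residue belongs to `excluded`. The 'at least n' semantics is an
    assumption, not taken from the 3of5 paper."""
    allowed_set = set(allowed)
    excluded_set = set(excluded)
    hits = []
    for start in range(len(sequence) - m + 1):
        window = sequence[start:start + m]
        # Skip windows that contain an explicitly excluded residue.
        if any(res in excluded_set for res in window):
            continue
        # Count residues from the allowed set, regardless of position.
        if sum(res in allowed_set for res in window) >= n:
            hits.append((start, window))
    return hits


def all_matches(pattern, sequence):
    """Hypothetical helper: return every (start, substring) that the
    regular expression matches in full, for all start and end positions,
    rather than only the single greedy or lazy hit per start."""
    compiled = re.compile(pattern)
    results = []
    for start in range(len(sequence)):
        for end in range(start + 1, len(sequence) + 1):
            if compiled.fullmatch(sequence, start, end):
                results.append((start, sequence[start:end]))
    return results


# "3 of 5" basic residues (K or R) in a made-up peptide.
basic_hits = n_of_m_hits("MAKRKAARLKP", allowed="KR", n=3, m=5)

# A length-ambiguous pattern: K, then 1 to 3 arbitrary residues, then R.
ambiguous_hits = all_matches(r"K.{1,3}R", "KARAR")
```

Under these assumptions, `basic_hits` contains the windows starting at positions 0 to 3, and `ambiguous_hits` contains both `KAR` and `KARAR`, whereas a single `re.search` call with the same expression would return only the greedy hit `KARAR`. The exhaustive start/end scan is quadratic in sequence length and is chosen here for clarity, not speed.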
format Text
id pubmed-1523217
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1523217 2006-07-27 The 3of5 web application for complex and comprehensive pattern matching in protein sequences Seiler, Markus Mehrle, Alexander Poustka, Annemarie Wiemann, Stefan BMC Bioinformatics Software BioMed Central 2006-03-16 /pmc/articles/PMC1523217/ /pubmed/16542452 http://dx.doi.org/10.1186/1471-2105-7-144 Text en Copyright © 2006 Seiler et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The 3of5 web application for complex and comprehensive pattern matching in protein sequences
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1523217/
https://www.ncbi.nlm.nih.gov/pubmed/16542452
http://dx.doi.org/10.1186/1471-2105-7-144