The 3of5 web application for complex and comprehensive pattern matching in protein sequences
BACKGROUND: The identification of patterns in biological sequences is a key challenge in genome analysis and in proteomics. Frequently such patterns are complex and highly variable, especially in protein sequences. They are frequently described using terms of regular expressions (RegEx) because of t...
Main authors: | Seiler, Markus; Mehrle, Alexander; Poustka, Annemarie; Wiemann, Stefan |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1523217/ https://www.ncbi.nlm.nih.gov/pubmed/16542452 http://dx.doi.org/10.1186/1471-2105-7-144 |
_version_ | 1782128802340536320 |
---|---|
author | Seiler, Markus; Mehrle, Alexander; Poustka, Annemarie; Wiemann, Stefan |
author_facet | Seiler, Markus; Mehrle, Alexander; Poustka, Annemarie; Wiemann, Stefan |
author_sort | Seiler, Markus |
collection | PubMed |
description | BACKGROUND: The identification of patterns in biological sequences is a key challenge in genome analysis and in proteomics. Frequently such patterns are complex and highly variable, especially in protein sequences. They are frequently described using terms of regular expressions (RegEx) because of the user-friendly terminology. Limitations arise for queries with the increasing complexity of patterns and are accompanied by requirements for enhanced capabilities. This is especially true for patterns containing ambiguous characters and positions and/or length ambiguities. RESULTS: We have implemented the 3of5 web application in order to enable complex pattern matching in protein sequences. 3of5 is named after a special use of its main feature, the novel n-of-m pattern type. This feature allows for an extensive specification of variable patterns where the individual elements may vary in their position, order, and content within a defined stretch of sequence. The number of distinct elements can be constrained by operators, and individual characters may be excluded. The n-of-m pattern type can be combined with common regular expression terms and thus also allows for a comprehensive description of complex patterns. 3of5 increases the fidelity of pattern matching and finds ALL possible solutions in protein sequences in cases of length-ambiguous patterns instead of simply reporting the longest or shortest hits. Grouping and combined search for patterns provides a hierarchical arrangement of larger patterns sets. The algorithm is implemented as internet application and freely accessible. The application is available at . CONCLUSION: The 3of5 application offers an extended vocabulary for the definition of search patterns and thus allows the user to comprehensively specify and identify peptide patterns with variable elements. The n-of-m pattern type offers an improved accuracy for pattern matching in combination with the ability to find all solutions, without compromising the user friendliness of regular expression terms. |
format | Text |
id | pubmed-1523217 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-15232172006-07-27 The 3of5 web application for complex and comprehensive pattern matching in protein sequences Seiler, Markus Mehrle, Alexander Poustka, Annemarie Wiemann, Stefan BMC Bioinformatics Software BACKGROUND: The identification of patterns in biological sequences is a key challenge in genome analysis and in proteomics. Frequently such patterns are complex and highly variable, especially in protein sequences. They are frequently described using terms of regular expressions (RegEx) because of the user-friendly terminology. Limitations arise for queries with the increasing complexity of patterns and are accompanied by requirements for enhanced capabilities. This is especially true for patterns containing ambiguous characters and positions and/or length ambiguities. RESULTS: We have implemented the 3of5 web application in order to enable complex pattern matching in protein sequences. 3of5 is named after a special use of its main feature, the novel n-of-m pattern type. This feature allows for an extensive specification of variable patterns where the individual elements may vary in their position, order, and content within a defined stretch of sequence. The number of distinct elements can be constrained by operators, and individual characters may be excluded. The n-of-m pattern type can be combined with common regular expression terms and thus also allows for a comprehensive description of complex patterns. 3of5 increases the fidelity of pattern matching and finds ALL possible solutions in protein sequences in cases of length-ambiguous patterns instead of simply reporting the longest or shortest hits. Grouping and combined search for patterns provides a hierarchical arrangement of larger patterns sets. The algorithm is implemented as internet application and freely accessible. The application is available at . 
CONCLUSION: The 3of5 application offers an extended vocabulary for the definition of search patterns and thus allows the user to comprehensively specify and identify peptide patterns with variable elements. The n-of-m pattern type offers an improved accuracy for pattern matching in combination with the ability to find all solutions, without compromising the user friendliness of regular expression terms. BioMed Central 2006-03-16 /pmc/articles/PMC1523217/ /pubmed/16542452 http://dx.doi.org/10.1186/1471-2105-7-144 Text en Copyright © 2006 Seiler et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Seiler, Markus Mehrle, Alexander Poustka, Annemarie Wiemann, Stefan The 3of5 web application for complex and comprehensive pattern matching in protein sequences |
title | The 3of5 web application for complex and comprehensive pattern matching in protein sequences |
title_full | The 3of5 web application for complex and comprehensive pattern matching in protein sequences |
title_fullStr | The 3of5 web application for complex and comprehensive pattern matching in protein sequences |
title_full_unstemmed | The 3of5 web application for complex and comprehensive pattern matching in protein sequences |
title_short | The 3of5 web application for complex and comprehensive pattern matching in protein sequences |
title_sort | 3of5 web application for complex and comprehensive pattern matching in protein sequences |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1523217/ https://www.ncbi.nlm.nih.gov/pubmed/16542452 http://dx.doi.org/10.1186/1471-2105-7-144 |
work_keys_str_mv | AT seilermarkus the3of5webapplicationforcomplexandcomprehensivepatternmatchinginproteinsequences AT mehrlealexander the3of5webapplicationforcomplexandcomprehensivepatternmatchinginproteinsequences AT poustkaannemarie the3of5webapplicationforcomplexandcomprehensivepatternmatchinginproteinsequences AT wiemannstefan the3of5webapplicationforcomplexandcomprehensivepatternmatchinginproteinsequences AT seilermarkus 3of5webapplicationforcomplexandcomprehensivepatternmatchinginproteinsequences AT mehrlealexander 3of5webapplicationforcomplexandcomprehensivepatternmatchinginproteinsequences AT poustkaannemarie 3of5webapplicationforcomplexandcomprehensivepatternmatchinginproteinsequences AT wiemannstefan 3of5webapplicationforcomplexandcomprehensivepatternmatchinginproteinsequences |
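
The abstract in this record describes two ideas that a short sketch can make concrete: an n-of-m pattern (n matching elements anywhere within a stretch of m residues) and the reporting of all hits for a length-ambiguous pattern rather than only the longest or shortest one. The Python below is an illustration only, not the 3of5 implementation or its query syntax. The function names `n_of_m_hits` and `all_matches` are hypothetical, and the reading of n-of-m as "at least n residues from an allowed set" is an assumption; the record does not specify the operators 3of5 actually supports.

```python
import re


def n_of_m_hits(sequence, allowed, n, m):
    """Return (start, window) for each window of length m that contains
    at least n residues from `allowed`.

    Assumption: "n-of-m" is read here as "at least n"; the real 3of5
    operators may constrain the count differently.
    """
    allowed = set(allowed)
    hits = []
    for start in range(len(sequence) - m + 1):
        window = sequence[start:start + m]
        # Position and order of the matching residues are free.
        if sum(residue in allowed for residue in window) >= n:
            hits.append((start, window))
    return hits


def all_matches(pattern, sequence):
    """Return every (start, end) span that `pattern` matches in full,
    including overlapping and nested spans.

    This is a brute-force enumeration for clarity, not an efficient
    algorithm.
    """
    compiled = re.compile(pattern)
    spans = []
    for start in range(len(sequence)):
        for end in range(start + 1, len(sequence) + 1):
            if compiled.fullmatch(sequence, start, end):
                spans.append((start, end))
    return spans


if __name__ == "__main__":
    # "3 of 5": three basic residues (K or R) within five positions.
    print(n_of_m_hits("AKRKAAKA", "KR", 3, 5))
    # A length-ambiguous pattern: a standard regex search reports one hit,
    # the exhaustive enumeration reports both.
    print(re.findall(r"A.{1,2}C", "ABCC"))
    print(all_matches(r"A.{1,2}C", "ABCC"))
```

In this toy example, the standard `re.findall` call returns only the greedy (longest) hit, while `all_matches` also returns the shorter span nested inside it, which is the behavior the abstract contrasts with conventional regular expression searches.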