
tacg – a grep for DNA

BACKGROUND: Pattern matching is the core of bioinformatics; it is used in database searching, restriction enzyme mapping, and finding open reading frames. It is done repeatedly over increasingly long sequences, thus codes must be efficient and insensitive to sequence length. Such patterns of interes...
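As an illustration of one pattern type the full abstract names, simple motifs with IUPAC degeneracies, the following minimal Python sketch expands a degenerate motif into a regular expression and reports every match position. It is not taken from tacg and does not reflect how tacg is implemented; the function names `iupac_to_regex` and `find_motif` are hypothetical, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate a degenerate motif such as 'GANTC' into a regex string.

    Hypothetical helper for illustration only; not part of tacg.
    Raises KeyError on a character that is not an IUPAC nucleotide code.
    """
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(motif, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead matches at every start position without consuming characters,
    # so overlapping occurrences are all reported.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # GANTC is the recognition site of the restriction enzyme HinfI.
    print(find_motif("GANTC", "TTGAATCGGACTC"))
```

A regular expression engine is only one way to do this; the sketch favors brevity over the speed and length-insensitivity that the abstract emphasizes.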


Bibliographic Details
Main author: Mangalam, Harry J
Format: Text
Language: English
Published: BioMed Central 2002
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC99049/
https://www.ncbi.nlm.nih.gov/pubmed/11882250
http://dx.doi.org/10.1186/1471-2105-3-8
_version_ 1782120204271091712
author Mangalam, Harry J
author_facet Mangalam, Harry J
author_sort Mangalam, Harry J
collection PubMed
description BACKGROUND: Pattern matching is the core of bioinformatics; it is used in database searching, restriction enzyme mapping, and finding open reading frames. It is done repeatedly over increasingly long sequences, thus codes must be efficient and insensitive to sequence length. Such patterns of interest include simple motifs with IUPAC degeneracies, regular expressions, patterns allowing mismatches, and probability matrices. RESULTS: I describe a small application which allows searching for all the above pattern types individually, which further allows these atomic motifs to be assembled into logical rules for more sophisticated analysis. CONCLUSION: tacg is small, portable, faster and more capable than most alternatives, relatively easy to modify, and freely available in source code.
format Text
id pubmed-99049
institution National Center for Biotechnology Information
language English
publishDate 2002
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-990492002-03-22 tacg – a grep for DNA Mangalam, Harry J BMC Bioinformatics Methodology article BACKGROUND: Pattern matching is the core of bioinformatics; it is used in database searching, restriction enzyme mapping, and finding open reading frames. It is done repeatedly over increasingly long sequences, thus codes must be efficient and insensitive to sequence length. Such patterns of interest include simple motifs with IUPAC degeneracies, regular expressions, patterns allowing mismatches, and probability matrices. RESULTS: I describe a small application which allows searching for all the above pattern types individually, which further allows these atomic motifs to be assembled into logical rules for more sophisticated analysis. CONCLUSION: tacg is small, portable, faster and more capable than most alternatives, relatively easy to modify, and freely available in source code. BioMed Central 2002-03-06 /pmc/articles/PMC99049/ /pubmed/11882250 http://dx.doi.org/10.1186/1471-2105-3-8 Text en Copyright ©2002 Mangalam; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
spellingShingle Methodology article
Mangalam, Harry J
tacg – a grep for DNA
title tacg – a grep for DNA
title_full tacg – a grep for DNA
title_fullStr tacg – a grep for DNA
title_full_unstemmed tacg – a grep for DNA
title_short tacg – a grep for DNA
title_sort tacg – a grep for dna
topic Methodology article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC99049/
https://www.ncbi.nlm.nih.gov/pubmed/11882250
http://dx.doi.org/10.1186/1471-2105-3-8
work_keys_str_mv AT mangalamharryj tacgagrepfordna