
HitKeeper, a generic software package for hit list management

BACKGROUND: The automated annotation of biological sequences (protein, DNA) relies on the computation of hits (predicted features) on the sequences using various algorithms. Public databases of biological sequences provide a wealth of biological "knowledge", for example manually validated...


Bibliographic Details
Main authors: Hau, Jörg; Muller, Michael; Pagni, Marco
Format: Text
Language: English
Published: BioMed Central 2007
Subjects: Software Review
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852800/
https://www.ncbi.nlm.nih.gov/pubmed/17391514
http://dx.doi.org/10.1186/1751-0473-2-2
author Hau, Jörg
Muller, Michael
Pagni, Marco
collection PubMed
description BACKGROUND: The automated annotation of biological sequences (protein, DNA) relies on the computation of hits (predicted features) on the sequences using various algorithms. Public databases of biological sequences provide a wealth of biological "knowledge", for example manually validated annotations (features) that are located on the sequences, but mining the sequence annotations and especially the predicted and curated features requires dedicated tools. Due to the heterogeneity and diversity of the biological information, it is difficult to handle redundancy, frequent updates, taxonomic information and "private" data together with computational algorithms in a common workflow. RESULTS: We present HitKeeper, a software package that controls the fully automatic handling of multiple biological databases and of hit list calculations on a large scale. The software implements an asynchronous update system that introduces updates and computes hits as soon as new data become available. A query interface enables the user to search sequences by specifying constraints, such as retrieving sequences that contain specific motifs, or a defined arrangement of motifs ("metamotifs"), or filtering based on the taxonomic classification of a sequence. CONCLUSION: The software provides a generic and modular framework to handle the redundancy and incremental updates of biological databases, and an original query language. It is published under the terms and conditions of version 2 of the GNU Public License and available at .
format Text
id pubmed-1852800
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1852800 2007-04-19 HitKeeper, a generic software package for hit list management Hau, Jörg Muller, Michael Pagni, Marco Source Code Biol Med Software Review BioMed Central 2007-03-28 /pmc/articles/PMC1852800/ /pubmed/17391514 http://dx.doi.org/10.1186/1751-0473-2-2 Text en Copyright © 2007 Hau et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title HitKeeper, a generic software package for hit list management
topic Software Review