HitKeeper, a generic software package for hit list management
BACKGROUND: The automated annotation of biological sequences (protein, DNA) relies on the computation of hits (predicted features) on the sequences using various algorithms. Public databases of biological sequences provide a wealth of biological "knowledge", for example manually validated...
Main authors: | Hau, Jörg; Muller, Michael; Pagni, Marco |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | Software Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852800/ https://www.ncbi.nlm.nih.gov/pubmed/17391514 http://dx.doi.org/10.1186/1751-0473-2-2 |
_version_ | 1782133085243965440 |
---|---|
author | Hau, Jörg Muller, Michael Pagni, Marco |
author_facet | Hau, Jörg Muller, Michael Pagni, Marco |
author_sort | Hau, Jörg |
collection | PubMed |
description | BACKGROUND: The automated annotation of biological sequences (protein, DNA) relies on the computation of hits (predicted features) on the sequences using various algorithms. Public databases of biological sequences provide a wealth of biological "knowledge", for example manually validated annotations (features) that are located on the sequences, but mining the sequence annotations and especially the predicted and curated features requires dedicated tools. Due to the heterogeneity and diversity of the biological information, it is difficult to handle redundancy, frequent updates, taxonomic information and "private" data together with computational algorithms in a common workflow. RESULTS: We present HitKeeper, a software package that controls the fully automatic handling of multiple biological databases and of hit list calculations on a large scale. The software implements an asynchronous update system that introduces updates and computes hits as soon as new data become available. A query interface enables the user to search sequences by specifying constraints, such as retrieving sequences that contain specific motifs, or a defined arrangement of motifs ("metamotifs"), or filtering based on the taxonomic classification of a sequence. CONCLUSION: The software provides a generic and modular framework to handle the redundancy and incremental updates of biological databases, and an original query language. It is published under the terms and conditions of version 2 of the GNU Public License and available at . |
format | Text |
id | pubmed-1852800 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-18528002007-04-19 HitKeeper, a generic software package for hit list management Hau, Jörg Muller, Michael Pagni, Marco Source Code Biol Med Software Review BACKGROUND: The automated annotation of biological sequences (protein, DNA) relies on the computation of hits (predicted features) on the sequences using various algorithms. Public databases of biological sequences provide a wealth of biological "knowledge", for example manually validated annotations (features) that are located on the sequences, but mining the sequence annotations and especially the predicted and curated features requires dedicated tools. Due to the heterogeneity and diversity of the biological information, it is difficult to handle redundancy, frequent updates, taxonomic information and "private" data together with computational algorithms in a common workflow. RESULTS: We present HitKeeper, a software package that controls the fully automatic handling of multiple biological databases and of hit list calculations on a large scale. The software implements an asynchronous update system that introduces updates and computes hits as soon as new data become available. A query interface enables the user to search sequences by specifying constraints, such as retrieving sequences that contain specific motifs, or a defined arrangement of motifs ("metamotifs"), or filtering based on the taxonomic classification of a sequence. CONCLUSION: The software provides a generic and modular framework to handle the redundancy and incremental updates of biological databases, and an original query language. It is published under the terms and conditions of version 2 of the GNU Public License and available at . BioMed Central 2007-03-28 /pmc/articles/PMC1852800/ /pubmed/17391514 http://dx.doi.org/10.1186/1751-0473-2-2 Text en Copyright © 2007 Hau et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Review Hau, Jörg Muller, Michael Pagni, Marco HitKeeper, a generic software package for hit list management |
title | HitKeeper, a generic software package for hit list management |
title_full | HitKeeper, a generic software package for hit list management |
title_fullStr | HitKeeper, a generic software package for hit list management |
title_full_unstemmed | HitKeeper, a generic software package for hit list management |
title_short | HitKeeper, a generic software package for hit list management |
title_sort | hitkeeper, a generic software package for hit list management |
topic | Software Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852800/ https://www.ncbi.nlm.nih.gov/pubmed/17391514 http://dx.doi.org/10.1186/1751-0473-2-2 |
work_keys_str_mv | AT haujorg hitkeeperagenericsoftwarepackageforhitlistmanagement AT mullermichael hitkeeperagenericsoftwarepackageforhitlistmanagement AT pagnimarco hitkeeperagenericsoftwarepackageforhitlistmanagement |