
SIMAP: the similarity matrix of proteins

Similarity Matrix of Proteins (SIMAP) provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is upd...


Bibliographic Details
Main Authors: Rattei, Thomas, Arnold, Roland, Tischler, Patrick, Lindner, Dominik, Stümpflen, Volker, Mewes, H. Werner
Format: Text
Language: English
Published: Oxford University Press 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347468/
https://www.ncbi.nlm.nih.gov/pubmed/16381858
http://dx.doi.org/10.1093/nar/gkj106
author Rattei, Thomas
Arnold, Roland
Tischler, Patrick
Lindner, Dominik
Stümpflen, Volker
Mewes, H. Werner
author_sort Rattei, Thomas
collection PubMed
description Similarity Matrix of Proteins (SIMAP) provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system, which is based on FASTA heuristics and the Smith–Waterman algorithm. Our ProtInfo system allows querying by protein sequences covered by the SIMAP dataset as well as by fragments of these sequences, highly similar sequences and title words. Each sequence in the database is supplemented with pre-calculated features generated by detailed sequence analyses. By providing WWW interfaces as well as web-services, we offer the SIMAP resource as an efficient and comprehensive tool for sequence similarity searches.
format Text
id pubmed-1347468
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-1347468 2006-01-25 SIMAP: the similarity matrix of proteins Rattei, Thomas Arnold, Roland Tischler, Patrick Lindner, Dominik Stümpflen, Volker Mewes, H. Werner Nucleic Acids Res Article Similarity Matrix of Proteins (SIMAP) provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system, which is based on FASTA heuristics and the Smith–Waterman algorithm. Our ProtInfo system allows querying by protein sequences covered by the SIMAP dataset as well as by fragments of these sequences, highly similar sequences and title words. Each sequence in the database is supplemented with pre-calculated features generated by detailed sequence analyses. By providing WWW interfaces as well as web-services, we offer the SIMAP resource as an efficient and comprehensive tool for sequence similarity searches. Oxford University Press 2006-01-01 2005-12-28 /pmc/articles/PMC1347468/ /pubmed/16381858 http://dx.doi.org/10.1093/nar/gkj106 Text en © The Author 2006. Published by Oxford University Press. All rights reserved
title SIMAP: the similarity matrix of proteins
title_sort simap: the similarity matrix of proteins
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347468/
https://www.ncbi.nlm.nih.gov/pubmed/16381858
http://dx.doi.org/10.1093/nar/gkj106
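
The description in this record mentions a pre-computed similarity matrix that is updated incrementally, with pairwise alignments based on the Smith–Waterman algorithm. The sketch below illustrates that general idea only; it is not the SIMAP implementation. Everything in it is an assumption made for illustration: the function names `smith_waterman_score` and `add_sequence`, the simple match/mismatch scoring with a linear gap penalty (a production system would more plausibly use a substitution matrix and affine gaps), the score threshold, and the dictionary used as a sparse matrix. The FASTA pre-filtering heuristics named in the description are not modeled.

```python
from typing import Dict, Tuple


def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of a and b.

    Assumed scoring: fixed match/mismatch values and a linear gap penalty.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def add_sequence(matrix: Dict[Tuple[str, str], int],
                 sequences: Dict[str, str],
                 new_id: str, new_seq: str, threshold: int = 6) -> None:
    """Incrementally extend a sparse similarity matrix (hypothetical helper).

    The new sequence is compared only against sequences already stored;
    existing entries are never recomputed. Scores below the assumed
    threshold are not stored, which keeps the matrix sparse.
    """
    for other_id, other_seq in sequences.items():
        score = smith_waterman_score(new_seq, other_seq)
        if score >= threshold:
            key = tuple(sorted((new_id, other_id)))  # symmetric: store once
            matrix[key] = score
    sequences[new_id] = new_seq


# Usage with made-up identifiers and toy sequences.
sequences: Dict[str, str] = {}
matrix: Dict[Tuple[str, str], int] = {}
add_sequence(matrix, sequences, "p1", "MKTAYIAK")
add_sequence(matrix, sequences, "p2", "MKTAYIAK")
add_sequence(matrix, sequences, "p3", "GGGG")
print(matrix)
```

With these toy inputs, only the pair ("p1", "p2") is stored, because "p3" shares no residues with the other two sequences. Storing each pair once under a sorted key reflects the symmetry of the score, and comparing a new sequence only against existing ones is what makes the update incremental rather than a full recomputation.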