Analysis of superfamily specific profile-profile recognition accuracy

Bibliographic Details
Main authors: Casbon, James A; Saqi, Mansoor AS
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC543460/
https://www.ncbi.nlm.nih.gov/pubmed/15603591
http://dx.doi.org/10.1186/1471-2105-5-200
author Casbon, James A
Saqi, Mansoor AS
collection PubMed
description BACKGROUND: Annotation of sequences that share little similarity to sequences of known function remains a major obstacle in genome annotation. Some of the best methods of detecting remote relationships between protein sequences are based on matching sequence profiles. We analyse the superfamily specific performance of sequence profile-profile matching. Our benchmark consists of a set of 16 protein superfamilies that are highly diverse at the sequence level. We relate the performance to the number of sequences in the profiles, the profile diversity and the extent of structural conservation in the superfamily.
RESULTS: The performance varies greatly between superfamilies with the truncated receiver operating characteristic, ROC(10), varying from 0.95 down to 0.01. These large differences persist even when the profiles are trimmed to approximately the same level of diversity.
CONCLUSIONS: Although the number of sequences in the profile (profile width) and degree of sequence variation within positions in the profile (profile diversity) contribute to accurate detection, there are other superfamily specific factors.
format Text
id pubmed-543460
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-543460 2005-01-07 Analysis of superfamily specific profile-profile recognition accuracy. Casbon, James A; Saqi, Mansoor AS. BMC Bioinformatics. Research Article. BioMed Central 2004-12-16 /pmc/articles/PMC543460/ /pubmed/15603591 http://dx.doi.org/10.1186/1471-2105-5-200 Text en Copyright © 2004 Casbon and Saqi; licensee BioMed Central Ltd.
title Analysis of superfamily specific profile-profile recognition accuracy
topic Research Article