
Protein comparison at the domain architecture level

BACKGROUND: The general method used to determine the function of newly discovered proteins is to transfer annotations from well-characterized homologous proteins. The process of selecting homologous proteins can largely be classified into sequence-based and domain-based approaches. Domain-based meth...


Bibliographic Details
Main authors: Lee, Byungwook, Lee, Doheon
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2788356/
https://www.ncbi.nlm.nih.gov/pubmed/19958515
http://dx.doi.org/10.1186/1471-2105-10-S15-S5
author Lee, Byungwook
Lee, Doheon
collection PubMed
description BACKGROUND: The general method used to determine the function of newly discovered proteins is to transfer annotations from well-characterized homologous proteins. The process of selecting homologous proteins can largely be classified into sequence-based and domain-based approaches. Compared to sequence-based methods, domain-based methods have several advantages for identifying distant homology and homology among proteins with multiple domains. However, these methods are challenged by large families defined by 'promiscuous' (or 'mobile') domains. RESULTS: Here we present a measure of domain architecture similarity, called Weighed Domain Architecture Comparison (WDAC), which can be used to identify homologs of multidomain proteins. To distinguish promiscuous domains from conventional protein domains, we assigned a weight score to each Pfam domain extracted from RefSeq proteins, based on its abundance and versatility. The similarity of two domain architectures is measured with cosine similarity (a similarity measure used in information retrieval). We combined sequence similarity with domain architecture comparisons to identify proteins belonging to the same domain architecture. Using human and nematode proteomes, we compared WDAC with an unweighted domain architecture method (DAC) to evaluate the effectiveness of domain weight scores. We found that WDAC is better at identifying homology among multidomain proteins. CONCLUSION: Our analysis indicates that considering domain weight scores in domain architecture comparisons improves protein homology identification. We developed a web-based server to allow users to compare their proteins with protein domain architectures.
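The abstract describes comparing domain architectures with cosine similarity over weighted domains. The following is a minimal sketch of that general idea, not the authors' implementation: the record does not give the weight formula, so the weights here are hypothetical values supplied by hand, and the domain labels (`dom_P`, `dom_X`, `dom_Y`) and function names are placeholders invented for illustration.

```python
import math
from collections import Counter


def weighted_vector(architecture, weights):
    """Turn a list of domain IDs into a sparse vector of weighted counts.

    Domains missing from `weights` default to 1.0 (an assumption of this sketch).
    """
    counts = Counter(architecture)
    return {dom: n * weights.get(dom, 1.0) for dom, n in counts.items()}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(u[d] * v[d] for d in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty architecture is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


def architecture_similarity(a, b, weights=None):
    """Compare two domain architectures; weights=None gives the unweighted case."""
    weights = weights or {}
    return cosine_similarity(weighted_vector(a, weights),
                             weighted_vector(b, weights))


# Two hypothetical proteins that share only a promiscuous domain.
arch_a = ["dom_P", "dom_X"]
arch_b = ["dom_P", "dom_Y"]

# Hypothetical weights: the promiscuous domain is down-weighted.
example_weights = {"dom_P": 0.1, "dom_X": 1.0, "dom_Y": 1.0}

unweighted = architecture_similarity(arch_a, arch_b)
weighted = architecture_similarity(arch_a, arch_b, example_weights)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this toy case the unweighted score is 0.5 because the shared domain counts as much as the unshared ones, while down-weighting the shared promiscuous domain pushes the score close to zero. This sketch ignores domain order and the sequence-similarity step mentioned in the abstract.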
format Text
id pubmed-2788356
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2788356 2009-12-04 Protein comparison at the domain architecture level Lee, Byungwook Lee, Doheon BMC Bioinformatics Proceedings
BioMed Central 2009-12-03 /pmc/articles/PMC2788356/ /pubmed/19958515 http://dx.doi.org/10.1186/1471-2105-10-S15-S5 Text en Copyright ©2009 Lee and Lee; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Protein comparison at the domain architecture level
topic Proceedings