
Towards Alignment Independent Quantitative Assessment of Homology Detection

Identification of homologous proteins provides a basis for protein annotation. Sequence alignment tools reliably identify homologs sharing high sequence similarity. However, identification of homologs that share low sequence similarity remains a challenge. Lowering the cutoff value could enable the...


Bibliographic Details
Main Authors: Apatoff, Avihay, Kim, Eddo, Kliger, Yossef
Format: Text
Language: English
Published: Public Library of Science 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1762415/
https://www.ncbi.nlm.nih.gov/pubmed/17205117
http://dx.doi.org/10.1371/journal.pone.0000113
_version_ 1782131566744436736
author Apatoff, Avihay
Kim, Eddo
Kliger, Yossef
author_facet Apatoff, Avihay
Kim, Eddo
Kliger, Yossef
author_sort Apatoff, Avihay
collection PubMed
description Identification of homologous proteins provides a basis for protein annotation. Sequence alignment tools reliably identify homologs sharing high sequence similarity. However, identification of homologs that share low sequence similarity remains a challenge. Lowering the cutoff value could enable the identification of diverged homologs, but also introduces numerous false hits. Methods are being continuously developed to minimize this problem. Estimation of the fraction of homologs in a set of protein alignments can help in the assessment and development of such methods, and provides the users with intuitive quantitative assessment of protein alignment results. Herein, we present a computational approach that estimates the amount of homologs in a set of protein pairs. The method requires a prevalent and detectable protein feature that is conserved between homologs. By analyzing the feature prevalence in a set of pairwise protein alignments, the method can estimate the number of homolog pairs in the set independently of the alignments' quality. Using the HomoloGene database as a standard of truth, we implemented this approach in a proteome-wide analysis. The results revealed that this approach, which is independent of the alignments themselves, works well for estimating the number of homologous proteins in a wide range of homology values. In summary, the presented method can accompany homology searches and method development, provides validation to search results, and allows tuning of tools and methods.
format Text
id pubmed-1762415
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1762415 2007-02-07 Towards Alignment Independent Quantitative Assessment of Homology Detection Apatoff, Avihay Kim, Eddo Kliger, Yossef PLoS One Research Article Identification of homologous proteins provides a basis for protein annotation. Sequence alignment tools reliably identify homologs sharing high sequence similarity. However, identification of homologs that share low sequence similarity remains a challenge. Lowering the cutoff value could enable the identification of diverged homologs, but also introduces numerous false hits. Methods are being continuously developed to minimize this problem. Estimation of the fraction of homologs in a set of protein alignments can help in the assessment and development of such methods, and provides the users with intuitive quantitative assessment of protein alignment results. Herein, we present a computational approach that estimates the amount of homologs in a set of protein pairs. The method requires a prevalent and detectable protein feature that is conserved between homologs. By analyzing the feature prevalence in a set of pairwise protein alignments, the method can estimate the number of homolog pairs in the set independently of the alignments' quality. Using the HomoloGene database as a standard of truth, we implemented this approach in a proteome-wide analysis. The results revealed that this approach, which is independent of the alignments themselves, works well for estimating the number of homologous proteins in a wide range of homology values. In summary, the presented method can accompany homology searches and method development, provides validation to search results, and allows tuning of tools and methods. Public Library of Science 2006-12-27 /pmc/articles/PMC1762415/ /pubmed/17205117 http://dx.doi.org/10.1371/journal.pone.0000113 Text en Apatoff et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Apatoff, Avihay
Kim, Eddo
Kliger, Yossef
Towards Alignment Independent Quantitative Assessment of Homology Detection
title Towards Alignment Independent Quantitative Assessment of Homology Detection
title_sort towards alignment independent quantitative assessment of homology detection
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1762415/
https://www.ncbi.nlm.nih.gov/pubmed/17205117
http://dx.doi.org/10.1371/journal.pone.0000113
work_keys_str_mv AT apatoffavihay towardsalignmentindependentquantitativeassessmentofhomologydetection
AT kimeddo towardsalignmentindependentquantitativeassessmentofhomologydetection
AT kligeryossef towardsalignmentindependentquantitativeassessmentofhomologydetection
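
The description above says the number of homolog pairs can be estimated from the prevalence of a conserved feature in a set of aligned pairs, without using the alignments themselves. This record does not give the authors' estimator, so the following is only a minimal sketch of one way such an estimate could work, under assumptions that are not from the source: the feature is binary, homologous pairs agree on it with a known probability, and unrelated pairs agree only at the chance rate implied by the feature's background prevalence. The function name and all parameters are hypothetical.

```python
def estimate_homolog_fraction(pairs, prevalence, agreement_in_homologs=1.0):
    """Estimate the fraction of homologous pairs from feature agreement.

    pairs: list of (bool, bool), whether each protein in a pair has the feature.
    prevalence: assumed background frequency of the feature in the proteome.
    agreement_in_homologs: assumed probability that true homologs agree.

    Assumed two-component mixture (an illustration, not the paper's formula):
        observed = h * agreement_in_homologs + (1 - h) * random_agreement
    """
    if not pairs:
        raise ValueError("pairs must be non-empty")

    # Fraction of pairs in which both proteins have, or both lack, the feature.
    observed = sum(a == b for a, b in pairs) / len(pairs)

    # Chance agreement for two independently drawn proteins.
    random_agreement = prevalence ** 2 + (1 - prevalence) ** 2

    denom = agreement_in_homologs - random_agreement
    if denom <= 0:
        raise ValueError("feature does not separate homologs from random pairs")

    # Solve the mixture equation for h and clamp it to a valid fraction.
    h = (observed - random_agreement) / denom
    return min(1.0, max(0.0, h))


# Toy data: 8 agreeing pairs and 2 disagreeing pairs, feature prevalence 0.5.
example_pairs = [(True, True)] * 5 + [(False, False)] * 3 + [(True, False)] * 2
fraction = estimate_homolog_fraction(example_pairs, prevalence=0.5)
estimated_homolog_pairs = fraction * len(example_pairs)
```

With this toy input the observed agreement is 0.8 and the chance agreement is 0.5, so the sketch attributes about 60% of the pairs (roughly 6 of 10) to homology. The estimate is only as good as the assumed prevalence and conservation rate, which is consistent with the description's requirement that the feature be prevalent, detectable, and conserved between homologs.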