Towards Alignment Independent Quantitative Assessment of Homology Detection
Main authors: | , , |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science, 2006 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1762415/ https://www.ncbi.nlm.nih.gov/pubmed/17205117 http://dx.doi.org/10.1371/journal.pone.0000113 |
Summary: | Identification of homologous proteins provides a basis for protein annotation. Sequence alignment tools reliably identify homologs sharing high sequence similarity. However, identification of homologs that share low sequence similarity remains a challenge. Lowering the cutoff value could enable the identification of diverged homologs, but also introduces numerous false hits. Methods are continuously being developed to minimize this problem. Estimating the fraction of homologs in a set of protein alignments can help in the assessment and development of such methods, and provides users with an intuitive quantitative assessment of protein alignment results. Herein, we present a computational approach that estimates the number of homologs in a set of protein pairs. The method requires a prevalent and detectable protein feature that is conserved between homologs. By analyzing the feature's prevalence in a set of pairwise protein alignments, the method can estimate the number of homolog pairs in the set independently of the alignments' quality. Using the HomoloGene database as a standard of truth, we implemented this approach in a proteome-wide analysis. The results revealed that this approach, which is independent of the alignments themselves, works well for estimating the number of homologous proteins across a wide range of homology values. In summary, the presented method can accompany homology searches and method development, validate search results, and allow tuning of tools and methods. |