HaMStR: Profile hidden markov model based search for orthologs in ESTs

BACKGROUND: EST sequencing is a versatile approach for rapidly gathering protein-coding sequences. ESTs provide direct access to an organism's gene repertoire, bypassing the still error-prone procedure of gene prediction from genomic data. Therefore, ESTs are often the only source for biological...

Bibliographic Details
Main authors: Ebersberger, Ingo, Strauss, Sascha, von Haeseler, Arndt
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723089/
https://www.ncbi.nlm.nih.gov/pubmed/19586527
http://dx.doi.org/10.1186/1471-2148-9-157
author Ebersberger, Ingo
Strauss, Sascha
von Haeseler, Arndt
collection PubMed
description BACKGROUND: EST sequencing is a versatile approach for rapidly gathering protein-coding sequences. ESTs provide direct access to an organism's gene repertoire, bypassing the still error-prone procedure of gene prediction from genomic data. Therefore, ESTs are often the only source for biological sequence data from taxa outside mainstream interest. The widespread use of ESTs in evolutionary studies, and particularly in molecular systematics studies, is still hindered by the lack of efficient and reliable approaches for automated ortholog prediction in ESTs. Existing methods either depend on a known species tree or cannot cope with redundancy in EST data. RESULTS: We present a novel approach (HaMStR) to mine EST data for the presence of orthologs to a curated set of genes. HaMStR combines a profile Hidden Markov Model search and a subsequent BLAST search to extend existing ortholog clusters with sequences from further taxa. We show that the HaMStR results are consistent with those obtained with existing orthology prediction methods that require completely sequenced genomes. A case study on the phylogeny of 35 fungal taxa illustrates that HaMStR is well suited to compile informative data sets for phylogenomic studies from ESTs and protein sequence data. CONCLUSION: HaMStR extends a pre-defined set of orthologs with ESTs from further taxa in a standardized manner. In the same fashion, HaMStR can be applied to protein sequence data, and thus provides a comprehensive approach to compile ortholog clusters from any protein-coding data. The resulting orthology predictions serve as the data basis for a variety of evolutionary studies. Here, we have demonstrated the application of HaMStR in a molecular systematics study. However, we envision that studies tracing the evolutionary fate of individual genes or functional complexes of genes will greatly benefit from HaMStR orthology predictions as well.
format Text
id pubmed-2723089
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2723089 2009-08-08 HaMStR: Profile hidden markov model based search for orthologs in ESTs Ebersberger, Ingo Strauss, Sascha von Haeseler, Arndt BMC Evol Biol Methodology Article BioMed Central 2009-07-08 /pmc/articles/PMC2723089/ /pubmed/19586527 http://dx.doi.org/10.1186/1471-2148-9-157 Text en Copyright © 2009 Ebersberger et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title HaMStR: Profile hidden markov model based search for orthologs in ESTs
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723089/
https://www.ncbi.nlm.nih.gov/pubmed/19586527
http://dx.doi.org/10.1186/1471-2148-9-157