HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers

BACKGROUND: First pass methods based on BLAST match are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes, and target putative horizontal gene transfer (HGT) events. This will continue to be necessary given the rapid growth of genomic dat...

Bibliographic Details
Main Authors: Zhu, Qiyun, Kosoy, Michael, Dittmar, Katharina
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4155097/
https://www.ncbi.nlm.nih.gov/pubmed/25159222
http://dx.doi.org/10.1186/1471-2164-15-717
_version_ 1782333522573262848
author Zhu, Qiyun
Kosoy, Michael
Dittmar, Katharina
author_facet Zhu, Qiyun
Kosoy, Michael
Dittmar, Katharina
author_sort Zhu, Qiyun
collection PubMed
description BACKGROUND: First pass methods based on BLAST match are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes, and target putative horizontal gene transfer (HGT) events. This will continue to be necessary given the rapid growth of genomic data and the technical difficulties in conducting large-scale explicit phylogenetic analyses. However, these methods often produce misleading results due to their inability to resolve indirect phylogenetic links and their vulnerability to stochastic events. RESULTS: A new computational method of rapid, exhaustive and genome-wide detection of HGT was developed, featuring the systematic analysis of BLAST hit distribution patterns in the context of a priori defined hierarchical evolutionary categories. Genes that fall beyond a series of statistically determined thresholds are identified as not adhering to the typical vertical history of the organisms in question, but instead having a putative horizontal origin. Tests on simulated genomic data suggest that this approach effectively targets atypically distributed genes that are highly likely to be HGT-derived, and exhibits robust performance compared to conventional BLAST-based approaches. This method was further tested on real genomic datasets, including Rickettsia genomes, and was compared to previous studies. Results show consistency with currently employed categories of HGT prediction methods. In-depth analysis of both simulated and real genomic data suggests that the method is notably insensitive to stochastic events such as gene loss, rate variation and database error, which are common challenges to the current methodology. An automated pipeline was created to implement this approach and was made publicly available at: https://github.com/DittmarLab/HGTector. The program is versatile, easily deployed, and has a low requirement for computational resources. CONCLUSIONS: HGTector is an effective tool for initial or standalone large-scale discovery of candidate HGT-derived genes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-717) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4155097
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4155097 2014-09-18 HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers Zhu, Qiyun Kosoy, Michael Dittmar, Katharina BMC Genomics Methodology Article BioMed Central 2014-08-26 /pmc/articles/PMC4155097/ /pubmed/25159222 http://dx.doi.org/10.1186/1471-2164-15-717 Text en © Zhu et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Zhu, Qiyun
Kosoy, Michael
Dittmar, Katharina
HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers
title HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers
title_full HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers
title_fullStr HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers
title_full_unstemmed HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers
title_short HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers
title_sort hgtector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4155097/
https://www.ncbi.nlm.nih.gov/pubmed/25159222
http://dx.doi.org/10.1186/1471-2164-15-717
work_keys_str_mv AT zhuqiyun hgtectoranautomatedmethodfacilitatinggenomewidediscoveryofputativehorizontalgenetransfers
AT kosoymichael hgtectoranautomatedmethodfacilitatinggenomewidediscoveryofputativehorizontalgenetransfers
AT dittmarkatharina hgtectoranautomatedmethodfacilitatinggenomewidediscoveryofputativehorizontalgenetransfers