Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data

Phylogenetic trees represent the evolutionary relationships among groups of species. This paper proposes a novel method for inferring prokaryotic phylogenies from multiple kinds of genomic information. The method, called CGCPhy, is based on the distance matrix of orthologous gene...

Bibliographic Details
Main authors: Du, Wei; Cao, Zhongbo; Wang, Yan; Sun, Ying; Blanzieri, Enrico; Liang, Yanchun
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3773407/
https://www.ncbi.nlm.nih.gov/pubmed/24073404
http://dx.doi.org/10.1155/2013/409062
_version_ 1782284412348530688
author Du, Wei
Cao, Zhongbo
Wang, Yan
Sun, Ying
Blanzieri, Enrico
Liang, Yanchun
author_facet Du, Wei
Cao, Zhongbo
Wang, Yan
Sun, Ying
Blanzieri, Enrico
Liang, Yanchun
author_sort Du, Wei
collection PubMed
description Phylogenetic trees represent the evolutionary relationships among groups of species. This paper proposes a novel method for inferring prokaryotic phylogenies from multiple kinds of genomic information. The method, called CGCPhy, is based on a distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined from sequence similarity, genomic function, and genomic structure information. Second, genes potentially involved in horizontal gene transfer (HGT) events are eliminated; these are taken to be the genes that are highly conserved across different species and the genes located on fragments with an abnormal genome barcode. Third, the distance between the orthologous gene clusters of each genome pair is calculated in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across the species. CGCPhy was evaluated on different datasets drawn from 617 complete single-chromosome prokaryotic genomes and achieved practical accuracies on different species sets, measured as agreement of quartet topologies with Bergey's taxonomy. Simulation results show that CGCPhy achieves high average accuracy with a low standard deviation across datasets, so it has practical potential for phylogenetic analysis.
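The third and fourth steps of the abstract (a pairwise distance from shared orthologous genes, then neighbor-joining) can be illustrated with a minimal sketch. The abstract does not give CGCPhy's distance formula, so `cluster_distance` below is an assumed stand-in (one minus the shared fraction), and the genome names, counts, and sizes are made up for illustration. The `neighbor_joining` function follows the standard Saitou-Nei pair-selection rule and returns only the topology as nested tuples, without branch lengths; it is not the authors' implementation.

```python
from itertools import combinations

def cluster_distance(shared, size_a, size_b):
    # ASSUMPTION: hypothetical distance, not the formula used by CGCPhy.
    # shared = number of orthologous genes in conserved clusters for the pair.
    return 1.0 - shared / min(size_a, size_b)

def neighbor_joining(names, dist):
    """Build an unrooted tree topology as nested tuples.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance for every pair.
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence: sum of distances from each node to all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
             for i in nodes}
        # Join the pair that minimizes the neighbor-joining Q criterion.
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        new = (a, b)
        # Distance from the new internal node to every remaining node.
        for k in nodes:
            if k != a and k != b:
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((a, k))]
                    + d[frozenset((b, k))]
                    - d[frozenset((a, b))]
                )
        nodes = [k for k in nodes if k != a and k != b] + [new]
    return tuple(nodes)

# Illustrative (made-up) data: four genomes, each with 100 clustered genes.
genomes = ["G1", "G2", "G3", "G4"]
sizes = {g: 100 for g in genomes}
shared = {("G1", "G2"): 90, ("G3", "G4"): 90,
          ("G1", "G3"): 50, ("G1", "G4"): 50,
          ("G2", "G3"): 50, ("G2", "G4"): 50}
dist = {frozenset(p): cluster_distance(s, sizes[p[0]], sizes[p[1]])
        for p, s in shared.items()}
print(neighbor_joining(genomes, dist))
```

Pairs that share more orthologous genes in conserved clusters get a smaller distance and are joined earlier. A production analysis would normally use an established neighbor-joining implementation that also estimates branch lengths.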
format Online
Article
Text
id pubmed-3773407
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-3773407 2013-09-26 Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data Du, Wei; Cao, Zhongbo; Wang, Yan; Sun, Ying; Blanzieri, Enrico; Liang, Yanchun Biomed Res Int Research Article Hindawi Publishing Corporation 2013 2013-08-29 /pmc/articles/PMC3773407/ /pubmed/24073404 http://dx.doi.org/10.1155/2013/409062 Text en Copyright © 2013 Wei Du et al.
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Du, Wei
Cao, Zhongbo
Wang, Yan
Sun, Ying
Blanzieri, Enrico
Liang, Yanchun
Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
title Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
title_full Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
title_fullStr Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
title_full_unstemmed Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
title_short Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
title_sort prokaryotic phylogenies inferred from whole-genome sequence and annotation data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3773407/
https://www.ncbi.nlm.nih.gov/pubmed/24073404
http://dx.doi.org/10.1155/2013/409062
work_keys_str_mv AT duwei prokaryoticphylogeniesinferredfromwholegenomesequenceandannotationdata
AT caozhongbo prokaryoticphylogeniesinferredfromwholegenomesequenceandannotationdata
AT wangyan prokaryoticphylogeniesinferredfromwholegenomesequenceandannotationdata
AT sunying prokaryoticphylogeniesinferredfromwholegenomesequenceandannotationdata
AT blanzierienrico prokaryoticphylogeniesinferredfromwholegenomesequenceandannotationdata
AT liangyanchun prokaryoticphylogeniesinferredfromwholegenomesequenceandannotationdata