Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
Phylogenetic trees are used to represent the evolutionary relationships among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple types of genomic information is proposed. The method is called CGCPhy and is based on the distance matrix of orthologous gene...
Main authors: | Du, Wei; Cao, Zhongbo; Wang, Yan; Sun, Ying; Blanzieri, Enrico; Liang, Yanchun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3773407/ https://www.ncbi.nlm.nih.gov/pubmed/24073404 http://dx.doi.org/10.1155/2013/409062 |
_version_ | 1782284412348530688 |
---|---|
author | Du, Wei Cao, Zhongbo Wang, Yan Sun, Ying Blanzieri, Enrico Liang, Yanchun |
author_facet | Du, Wei Cao, Zhongbo Wang, Yan Sun, Ying Blanzieri, Enrico Liang, Yanchun |
author_sort | Du, Wei |
collection | PubMed |
description | Phylogenetic trees represent the evolutionary relationships among groups of species. This paper proposes a novel method, CGCPhy, for inferring prokaryotic phylogenies from multiple types of genomic information. The method is based on a distance matrix of orthologous gene clusters computed between pairs of whole genomes, and it comprises four main steps. First, orthologous genes are determined from sequence similarity, genomic function, and genomic structure information. Second, genes potentially involved in horizontal gene transfer (HGT) events are eliminated; these are taken to be genes that are highly conserved across different species and genes located on fragments with an abnormal genome barcode. Third, the distance between the orthologous gene clusters of each genome pair is calculated in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is used to construct phylogenetic trees across the species. CGCPhy was examined on datasets drawn from 617 complete single-chromosome prokaryotic genomes and achieved practically useful accuracy on different species sets, as judged by agreement with Bergey's taxonomy in quartet topologies. Simulation results show that CGCPhy achieves high average accuracy with a low standard deviation across datasets, which suggests it is applicable to phylogenetic analysis. |
format | Online Article Text |
id | pubmed-3773407 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-3773407 2013-09-26 Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data Du, Wei Cao, Zhongbo Wang, Yan Sun, Ying Blanzieri, Enrico Liang, Yanchun Biomed Res Int Research Article Hindawi Publishing Corporation 2013 2013-08-29 /pmc/articles/PMC3773407/ /pubmed/24073404 http://dx.doi.org/10.1155/2013/409062 Text en Copyright © 2013 Wei Du et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Du, Wei Cao, Zhongbo Wang, Yan Sun, Ying Blanzieri, Enrico Liang, Yanchun Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data |
title | Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data |
title_full | Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data |
title_fullStr | Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data |
title_full_unstemmed | Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data |
title_short | Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data |
title_sort | prokaryotic phylogenies inferred from whole-genome sequence and annotation data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3773407/ https://www.ncbi.nlm.nih.gov/pubmed/24073404 http://dx.doi.org/10.1155/2013/409062 |
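
The last two steps named in the description (a pairwise distance derived from counts of orthologous genes in conserved clusters, followed by neighbor-joining) can be illustrated with a minimal Python sketch. This is not the CGCPhy implementation: the record does not give the distance formula, so `cluster_distance` below is a hypothetical stand-in, the genome labels and gene counts are invented for illustration, and the ortholog detection and HGT filtering steps are assumed to have been done already. Only the topology is returned; branch lengths are omitted for brevity.

```python
from itertools import combinations


def cluster_distance(shared, size_a, size_b):
    """Hypothetical distance (an assumption, not the paper's formula):
    one minus the fraction of shared clustered orthologs."""
    return 1.0 - shared / min(size_a, size_b)


def neighbor_joining(names, dist):
    """Return the neighbor-joining topology as nested tuples.

    names: list of leaf labels (strings).
    dist:  dict mapping frozenset({a, b}) -> distance.
    """
    nodes = list(names)
    d = dict(dist)

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all other current nodes.
        r = {a: sum(get(a, b) for b in nodes if b != a) for a in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) d(i, j) - r(i) - r(j).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * get(*p) - r[p[0]] - r[p[1]])
        new = (i, j)
        # Distance from the new internal node to every remaining node.
        for k in nodes:
            if k != i and k != j:
                d[frozenset((new, k))] = 0.5 * (get(i, k) + get(j, k) - get(i, j))
        nodes = [k for k in nodes if k != i and k != j] + [new]
    return tuple(nodes)


def leaves(tree):
    """Set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(t) for t in tree))


def splits(tree):
    """Non-trivial bipartitions of the leaf set implied by the tree."""
    all_leaves = leaves(tree)
    found = set()

    def walk(t):
        if isinstance(t, str):
            return
        side = leaves(t)
        other = all_leaves - side
        if len(side) > 1 and len(other) > 1:
            found.add(frozenset((side, other)))
        for child in t:
            walk(child)

    walk(tree)
    return found


# Invented example data: four genomes, each with 100 clustered genes,
# and the number of orthologs each pair shares in conserved clusters.
sizes = {"A": 100, "B": 100, "C": 100, "D": 100}
shared = {("A", "B"): 90, ("A", "C"): 40, ("A", "D"): 35,
          ("B", "C"): 38, ("B", "D"): 36, ("C", "D"): 85}
dist = {frozenset(p): cluster_distance(s, sizes[p[0]], sizes[p[1]])
        for p, s in shared.items()}
tree = neighbor_joining(list(sizes), dist)
print(tree)
```

Comparing trees by their bipartitions rather than by the nested tuples themselves avoids depending on the order in which tied pairs happen to be joined, since neighbor-joining yields an unrooted tree.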