
Detection of Genomic Idiosyncrasies Using Fuzzy Phylogenetic Profiles

Phylogenetic profiles express the presence or absence of genes and their homologs across a number of reference genomes. They have emerged as an elegant representation framework for comparative genomics and have been used for the genome-wide inference and discovery of functionally linked genes or met...


Bibliographic Details
Main authors: Psomopoulos, Fotis E., Mitkas, Pericles A., Ouzounis, Christos A.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3544837/
https://www.ncbi.nlm.nih.gov/pubmed/23341912
http://dx.doi.org/10.1371/journal.pone.0052854
_version_ 1782255860384268288
author Psomopoulos, Fotis E.
Mitkas, Pericles A.
Ouzounis, Christos A.
collection PubMed
description Phylogenetic profiles express the presence or absence of genes and their homologs across a number of reference genomes. They have emerged as an elegant representation framework for comparative genomics and have been used for the genome-wide inference and discovery of functionally linked genes or metabolic pathways. As the number of reference genomes grows, there is an acute need for faster and more accurate methods for phylogenetic profile analysis. We propose a novel, efficient method for the detection of genomic idiosyncrasies, i.e., sets of genes found in a specific genome with peculiar phylogenetic properties, such as intra-genome correlations or inter-genome relationships. Our algorithm is a four-step process: genome profiles are first defined as fuzzy vectors, then discretized to binary vectors, then de-noised, and finally compared to generate intra- and inter-genome distances for each gene profile. The method is validated with a carefully selected benchmark set of five reference genomes, using a range of approaches regarding similarity metrics and pre-processing stages for noise reduction. We demonstrate that the fuzzy profile method consistently identifies the actual phylogenetic relationship and origin of the genes under consideration in the majority of cases, while the detected outliers are found to be particular genes with peculiar phylogenetic patterns. The proposed method provides a time-efficient and highly scalable approach for phylogenetic stratification, with the detected groups of genes being either similar to their own genome profile or different from it, thus revealing atypical evolutionary histories.
format Online
Article
Text
id pubmed-3544837
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3544837 2013-01-22 Detection of Genomic Idiosyncrasies Using Fuzzy Phylogenetic Profiles Psomopoulos, Fotis E. Mitkas, Pericles A. Ouzounis, Christos A. PLoS One Research Article
Public Library of Science 2013-01-14 /pmc/articles/PMC3544837/ /pubmed/23341912 http://dx.doi.org/10.1371/journal.pone.0052854 Text en © 2013 Psomopoulos et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Detection of Genomic Idiosyncrasies Using Fuzzy Phylogenetic Profiles
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3544837/
https://www.ncbi.nlm.nih.gov/pubmed/23341912
http://dx.doi.org/10.1371/journal.pone.0052854
work_keys_str_mv AT psomopoulosfotise detectionofgenomicidiosyncrasiesusingfuzzyphylogeneticprofiles
AT mitkaspericlesa detectionofgenomicidiosyncrasiesusingfuzzyphylogeneticprofiles
AT ouzounischristosa detectionofgenomicidiosyncrasiesusingfuzzyphylogeneticprofiles
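
The four-step process summarized in the description field (fuzzy genome profiles, discretization, de-noising, intra- and inter-genome comparison) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the record does not state how the fuzzy vectors are built, which threshold or de-noising rule is used, or which similarity metric is applied. The column-wise mean, the 0.5 threshold, the `min_hits` filter, the Hamming distance, the toy genomes "A" and "B", and all function names are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

Profile = List[int]


def fuzzy_genome_profile(gene_profiles: Sequence[Sequence[int]]) -> List[float]:
    """Step 1 (assumed): average the binary gene profiles of one genome,
    column by column, giving a fuzzy vector with values in [0, 1]."""
    n = len(gene_profiles)
    return [sum(col) / n for col in zip(*gene_profiles)]


def discretize(fuzzy: Sequence[float], threshold: float = 0.5) -> Profile:
    """Step 2 (assumed threshold): turn the fuzzy vector into a binary one."""
    return [1 if v >= threshold else 0 for v in fuzzy]


def denoise(gene_profiles: Sequence[Sequence[int]], min_hits: int = 1) -> List[Profile]:
    """Step 3 (assumed rule): drop gene profiles with fewer than min_hits
    presences, since they carry no phylogenetic signal."""
    return [list(p) for p in gene_profiles if sum(p) >= min_hits]


def hamming(a: Sequence[int], b: Sequence[int]) -> float:
    """Normalized Hamming distance, one possible choice of metric."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distances(gene: Sequence[int], genome_profiles: Dict[str, Profile],
              own: str) -> Tuple[float, Dict[str, float]]:
    """Step 4: distance of a gene profile to its own genome profile (intra)
    and to every other genome profile (inter)."""
    intra = hamming(gene, genome_profiles[own])
    inter = {name: hamming(gene, p)
             for name, p in genome_profiles.items() if name != own}
    return intra, inter


# Toy data: each row is one gene, each column one reference genome.
genes = {
    "A": [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0]],
    "B": [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]],
}
genome_profiles = {name: discretize(fuzzy_genome_profile(g))
                   for name, g in genes.items()}

# Flag genes of genome A that sit closer to another genome's profile
# than to their own: candidates for an atypical evolutionary history.
atypical = []
for gene in denoise(genes["A"]):
    intra, inter = distances(gene, genome_profiles, "A")
    if min(inter.values()) < intra:
        atypical.append(gene)
```

In this toy setting the gene with profile `[0, 0, 1, 1]` in genome "A" is flagged because it matches the profile of genome "B" rather than its own, which mirrors the idea of genes being "either similar to their own genome profile or different from it".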