
Denoising inferred functional association networks obtained by gene fusion analysis

BACKGROUND: Gene fusion detection – also known as the 'Rosetta Stone' method – involves the identification of fused composite genes in a set of reference genomes, which indicates potential interactions between its un-fused counterpart genes in query genomes. The precision of this method ty...

Bibliographic Details
Main Authors: Kamburov, Atanas, Goldovsky, Leon, Freilich, Shiri, Kapazoglou, Aliki, Kunin, Victor, Enright, Anton J, Tsaftaris, Athanasios, Ouzounis, Christos A
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2248599/
https://www.ncbi.nlm.nih.gov/pubmed/18081932
http://dx.doi.org/10.1186/1471-2164-8-460
author Kamburov, Atanas
Goldovsky, Leon
Freilich, Shiri
Kapazoglou, Aliki
Kunin, Victor
Enright, Anton J
Tsaftaris, Athanasios
Ouzounis, Christos A
collection PubMed
description BACKGROUND: Gene fusion detection – also known as the 'Rosetta Stone' method – involves the identification of fused composite genes in a set of reference genomes, which indicates potential interactions between its un-fused counterpart genes in query genomes. The precision of this method typically improves with an ever-increasing number of reference genomes. RESULTS: In order to explore the usefulness and scope of this approach for protein interaction prediction and generate a high-quality, non-redundant set of interacting pairs of proteins across a wide taxonomic range, we have exhaustively performed gene fusion analysis for 184 genomes using an efficient variant of a previously developed protocol. By analyzing interaction graphs and applying a threshold that limits the maximum number of possible interactions within the largest graph components, we show that we can reduce the number of implausible interactions due to the detection of promiscuous domains. With this generally applicable approach, we generate a robust set of over 2 million distinct and testable interactions encompassing 696,894 proteins in 184 species or strains, most of which have never been the subject of high-throughput experimental proteomics. We investigate the cumulative effect of increasing numbers of genomes on the fidelity and quantity of predictions, and show that, for large numbers of genomes, predictions do not become saturated but continue to grow linearly, for the majority of the species. We also examine the percentage of component (and composite) proteins with relation to the number of genes and further validate the functional categories that are highly represented in this robust set of detected genome-wide interactions. CONCLUSION: We illustrate the phylogenetic and functional diversity of gene fusion events across genomes, and their usefulness for accurate prediction of protein interaction and function.
format Text
id pubmed-2248599
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2248599 2008-02-21 Denoising inferred functional association networks obtained by gene fusion analysis BMC Genomics Research Article BioMed Central 2007-12-14 /pmc/articles/PMC2248599/ /pubmed/18081932 http://dx.doi.org/10.1186/1471-2164-8-460 Text en Copyright © 2007 Kamburov et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Denoising inferred functional association networks obtained by gene fusion analysis
topic Research Article