
An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles

Predicting functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Due to the practical limitation of experimental characterization of all proteins encoded in a genome using biochemical studies, bioin...


Bibliographic Details
Main Authors: Kotaru, Appala Raju, Shameer, Khader, Sundaramurthy, Pandurangan, Joshi, Ramesh Chandra
Format: Online Article Text
Language: English
Published: Biomedical Informatics 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3669790/
https://www.ncbi.nlm.nih.gov/pubmed/23750082
http://dx.doi.org/10.6026/97320630009368
author Kotaru, Appala Raju
Shameer, Khader
Sundaramurthy, Pandurangan
Joshi, Ramesh Chandra
collection PubMed
description Predicting the functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Because experimentally characterizing every protein encoded in a genome through biochemical studies is impractical, bioinformatics methods provide powerful tools for function annotation and prediction. These methods also help narrow the growing sequence-to-function gap. Phylogenetic profiling is a bioinformatics approach to identify the influence of a trait across species and can be employed to infer the evolutionary history of proteins encoded in genomes. Here we propose an improved phylogenetic profile-based method that considers the co-evolution of the reference genome to derive the basic similarity measure, and the background phylogeny of the target genomes for profile generation and for assigning weights to target genomes. The ordering of genomes and the runs of consecutive matches between proteins were used to define phylogenetic relationships in the approach. We used the Escherichia coli K12 genome as the reference genome, and its 4195 proteins were used in the current analysis. We compared our approach with two existing methods, and our initial results show that its predictions outperform those of both. In addition, we validated our method using a targeted protein-protein interaction network derived from the STRING protein-protein interaction database. Our preliminary results indicate that function prediction can be improved by bringing coevolution-based similarity measures and the runs onto the same scale instead of computing them on different scales. Our method can be applied at the whole-genome level for annotating hypothetical proteins from prokaryotic genomes.
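The description names two ingredients, a hypergeometric probability over phylogenetic profiles and runs of consecutive matches, without giving formulas. The sketch below illustrates only the generic textbook versions of both ideas; it is not the authors' weighted method. The function names, the 0/1 profile encoding, and the definition of a "match" as equal entries at the same genome position are all assumptions made for illustration.

```python
from math import comb


def hypergeom_tail(n_genomes, n_a, n_b, shared):
    """P(X >= shared) for X ~ Hypergeometric(n_genomes, n_a, n_b).

    Chance that two proteins present in n_a and n_b genomes co-occur
    in at least `shared` genomes if presence were independent.
    """
    total = comb(n_genomes, n_b)
    upper = min(n_a, n_b)
    hits = sum(
        comb(n_a, i) * comb(n_genomes - n_a, n_b - i)
        for i in range(shared, upper + 1)
    )
    return hits / total


def profile_pvalue(profile_a, profile_b):
    """Hypergeometric tail probability for two 0/1 presence profiles."""
    shared = sum(a & b for a, b in zip(profile_a, profile_b))
    return hypergeom_tail(len(profile_a), sum(profile_a), sum(profile_b), shared)


def longest_match_run(profile_a, profile_b):
    """Longest run of consecutive positions where the profiles agree.

    Assumption: genomes are already in a fixed (e.g. phylogenetic) order,
    and a 'match' means equal entries, presence or absence.
    """
    best = current = 0
    for a, b in zip(profile_a, profile_b):
        current = current + 1 if a == b else 0
        best = max(best, current)
    return best
```

A smaller tail probability suggests that two profiles co-occur more often than chance would explain; how the published method weights genomes and combines this value with run lengths is not stated in this record and is not reproduced here.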
format Online
Article
Text
id pubmed-3669790
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Biomedical Informatics
record_format MEDLINE/PubMed
spelling pubmed-3669790 2013-06-07 An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles Kotaru, Appala Raju Shameer, Khader Sundaramurthy, Pandurangan Joshi, Ramesh Chandra Bioinformation Hypothesis Biomedical Informatics 2013-04-13 /pmc/articles/PMC3669790/ /pubmed/23750082 http://dx.doi.org/10.6026/97320630009368 Text en © 2013 Biomedical Informatics This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
title An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles
topic Hypothesis
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3669790/
https://www.ncbi.nlm.nih.gov/pubmed/23750082
http://dx.doi.org/10.6026/97320630009368