Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships

New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation ar...

Bibliographic Details
Main authors: Škunca, Nives, Bošnjak, Matko, Kriško, Anita, Panov, Panče, Džeroski, Sašo, Šmuc, Tomislav, Supek, Fran
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3536626/
https://www.ncbi.nlm.nih.gov/pubmed/23308060
http://dx.doi.org/10.1371/journal.pcbi.1002852
_version_ 1782254771984400384
author Škunca, Nives
Bošnjak, Matko
Kriško, Anita
Panov, Panče
Džeroski, Sašo
Šmuc, Tomislav
Supek, Fran
author_facet Škunca, Nives
Bošnjak, Matko
Kriško, Anita
Panov, Panče
Džeroski, Sašo
Šmuc, Tomislav
Supek, Fran
author_sort Škunca, Nives
collection PubMed
description New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation are invaluable. We introduce a novel model for computational annotation that refines two established concepts: annotation based on homology and annotation based on phyletic profiling. The phyletic profiling-based model that includes both inferred orthologs and paralogs—homologs separated by a speciation and a duplication event, respectively—provides more annotations at the same average Precision than the model that includes only inferred orthologs. For experimental validation, we selected 38 poorly annotated Escherichia coli genes for which the model assigned one of three GO terms with high confidence: involvement in DNA repair, protein translation, or cell wall synthesis. Results of antibiotic stress survival assays on E. coli knockout mutants showed high agreement with our model's estimates of accuracy: out of 38 predictions obtained at the reported Precision of 60%, we confirmed 25 predictions, indicating that our confidence estimates can be used to make informed decisions on experimental validation. Our work will contribute to making experimental validation of computational predictions more approachable, both in cost and time. Our predictions for 998 prokaryotic genomes include ∼400000 specific annotations with the estimated Precision of 90%, ∼19000 of which are highly specific—e.g. “penicillin binding,” “tRNA aminoacylation for protein translation,” or “pathogenesis”—and are freely available at http://gorbi.irb.hr/.
format Online
Article
Text
id pubmed-3536626
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-35366262013-01-10 Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships Škunca, Nives Bošnjak, Matko Kriško, Anita Panov, Panče Džeroski, Sašo Šmuc, Tomislav Supek, Fran PLoS Comput Biol Research Article New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation are invaluable. We introduce a novel model for computational annotation that refines two established concepts: annotation based on homology and annotation based on phyletic profiling. The phyletic profiling-based model that includes both inferred orthologs and paralogs—homologs separated by a speciation and a duplication event, respectively—provides more annotations at the same average Precision than the model that includes only inferred orthologs. For experimental validation, we selected 38 poorly annotated Escherichia coli genes for which the model assigned one of three GO terms with high confidence: involvement in DNA repair, protein translation, or cell wall synthesis. Results of antibiotic stress survival assays on E. coli knockout mutants showed high agreement with our model's estimates of accuracy: out of 38 predictions obtained at the reported Precision of 60%, we confirmed 25 predictions, indicating that our confidence estimates can be used to make informed decisions on experimental validation. Our work will contribute to making experimental validation of computational predictions more approachable, both in cost and time. Our predictions for 998 prokaryotic genomes include ∼400000 specific annotations with the estimated Precision of 90%, ∼19000 of which are highly specific—e.g. 
“penicillin binding,” “tRNA aminoacylation for protein translation,” or “pathogenesis”—and are freely available at http://gorbi.irb.hr/. Public Library of Science 2013-01-03 /pmc/articles/PMC3536626/ /pubmed/23308060 http://dx.doi.org/10.1371/journal.pcbi.1002852 Text en © 2013 Škunca et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Škunca, Nives
Bošnjak, Matko
Kriško, Anita
Panov, Panče
Džeroski, Sašo
Šmuc, Tomislav
Supek, Fran
Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships
title Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships
title_full Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships
title_fullStr Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships
title_full_unstemmed Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships
title_short Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships
title_sort phyletic profiling with cliques of orthologs is enhanced by signatures of paralogy relationships
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3536626/
https://www.ncbi.nlm.nih.gov/pubmed/23308060
http://dx.doi.org/10.1371/journal.pcbi.1002852
work_keys_str_mv AT skuncanives phyleticprofilingwithcliquesoforthologsisenhancedbysignaturesofparalogyrelationships
AT bosnjakmatko phyleticprofilingwithcliquesoforthologsisenhancedbysignaturesofparalogyrelationships
AT kriskoanita phyleticprofilingwithcliquesoforthologsisenhancedbysignaturesofparalogyrelationships
AT panovpance phyleticprofilingwithcliquesoforthologsisenhancedbysignaturesofparalogyrelationships
AT dzeroskisaso phyleticprofilingwithcliquesoforthologsisenhancedbysignaturesofparalogyrelationships
AT smuctomislav phyleticprofilingwithcliquesoforthologsisenhancedbysignaturesofparalogyrelationships
AT supekfran phyleticprofilingwithcliquesoforthologsisenhancedbysignaturesofparalogyrelationships