
PlasmoDraft: a database of Plasmodium falciparum gene function predictions based on postgenomic data

BACKGROUND: Of the 5 484 predicted proteins of Plasmodium falciparum, the main causative agent of malaria, about 60% do not have sufficient sequence similarity with proteins in other organisms to warrant provision of functional assignments. Non-homology methods are thus needed to obtain functional c...

Bibliographic Details
Main authors: Bréhélin, Laurent, Dufayard, Jean-François, Gascuel, Olivier
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2605471/
https://www.ncbi.nlm.nih.gov/pubmed/18925948
http://dx.doi.org/10.1186/1471-2105-9-440
_version_ 1782162857724477440
author Bréhélin, Laurent
Dufayard, Jean-François
Gascuel, Olivier
author_facet Bréhélin, Laurent
Dufayard, Jean-François
Gascuel, Olivier
author_sort Bréhélin, Laurent
collection PubMed
description BACKGROUND: Of the 5 484 predicted proteins of Plasmodium falciparum, the main causative agent of malaria, about 60% do not have sufficient sequence similarity with proteins in other organisms to warrant provision of functional assignments. Non-homology methods are thus needed to obtain functional clues for these uncharacterized genes. RESULTS: We present PlasmoDraft, a database of Gene Ontology (GO) annotation predictions for P. falciparum genes based on postgenomic data. Predictions of PlasmoDraft are achieved with a Guilt By Association method named Gonna. This involves (1) a predictor that proposes GO annotations for a gene based on the similarity of its profile (measured with transcriptome, proteome or interactome data) with genes already annotated by GeneDB; (2) a procedure that estimates the confidence of the predictions achieved with each data source; (3) a procedure that combines all data sources to provide a global summary and confidence estimate of the predictions. Gonna has been applied to all P. falciparum genes using most publicly available transcriptome, proteome and interactome data sources. Gonna provides predictions for numerous genes without any annotations. For example, 2 434 genes without any annotations in the Biological Process ontology are associated with specific GO terms (e.g. Rosetting, Antigenic variation), and among these, 841 have confidence values above 50%. In the Cellular Component and Molecular Function ontologies, 1 905 and 1 540 uncharacterized genes are associated with specific GO terms, respectively (740 and 329 with confidence value above 50%). CONCLUSION: All predictions along with their confidence values have been compiled in PlasmoDraft, which thus provides an extensive database of GO annotation predictions that can be achieved with these data sources. The database can be accessed in different ways. A global view allows for a quick inspection of the GO terms that are predicted with high confidence, depending on the various data sources. A gene view and a GO term view allow for the search of potential GO terms attached to a given gene, and genes that potentially belong to a given GO term.
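The three steps named in the abstract (a similarity-based predictor, a confidence estimate, a combination across data sources) can be illustrated with a minimal sketch. This is not the Gonna algorithm, whose details are not given in this record. It assumes a plain k-nearest-neighbour vote on Pearson correlation, uses the neighbour vote fraction as a stand-in for confidence, and combines sources by a naive independence rule. The gene identifiers, profiles, and the function names `pearson`, `predict_go_terms` and `combine_sources` are all hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no similarity signal
    return cov / sqrt(var_x * var_y)


def predict_go_terms(query_profile, annotated, k=3):
    """Rank GO terms by the fraction of the k most similar annotated genes that carry them."""
    ranked = sorted(
        annotated.values(),
        key=lambda rec: pearson(query_profile, rec["profile"]),
        reverse=True,
    )
    neighbours = ranked[:k]
    votes = Counter(term for rec in neighbours for term in set(rec["terms"]))
    return [(term, count / len(neighbours)) for term, count in votes.most_common()]


def combine_sources(confidences):
    """Naive combination 1 - prod(1 - c), assuming the data sources are independent."""
    miss = 1.0
    for c in confidences:
        miss *= 1.0 - c
    return 1.0 - miss


# Hypothetical annotated genes with toy expression profiles (illustrative values only).
annotated = {
    "geneA": {"profile": [1.0, 2.0, 3.0, 4.0], "terms": ["antigenic variation"]},
    "geneB": {"profile": [2.0, 4.0, 6.0, 8.5], "terms": ["antigenic variation", "rosetting"]},
    "geneC": {"profile": [4.0, 3.0, 2.0, 1.0], "terms": ["translation"]},
}

predictions = predict_go_terms([1.1, 2.0, 3.2, 3.9], annotated, k=2)
combined = combine_sources([0.5, 0.5])
```

With this toy data the two nearest neighbours are the two genes whose profiles rise with the query, so their shared term is ranked first. A real pipeline would calibrate confidence on held-out annotated genes rather than reading it off the vote fraction.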
format Text
id pubmed-2605471
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2605471 2008-12-19 PlasmoDraft: a database of Plasmodium falciparum gene function predictions based on postgenomic data Bréhélin, Laurent Dufayard, Jean-François Gascuel, Olivier BMC Bioinformatics Research Article BACKGROUND: Of the 5 484 predicted proteins of Plasmodium falciparum, the main causative agent of malaria, about 60% do not have sufficient sequence similarity with proteins in other organisms to warrant provision of functional assignments. Non-homology methods are thus needed to obtain functional clues for these uncharacterized genes. RESULTS: We present PlasmoDraft, a database of Gene Ontology (GO) annotation predictions for P. falciparum genes based on postgenomic data. Predictions of PlasmoDraft are achieved with a Guilt By Association method named Gonna. This involves (1) a predictor that proposes GO annotations for a gene based on the similarity of its profile (measured with transcriptome, proteome or interactome data) with genes already annotated by GeneDB; (2) a procedure that estimates the confidence of the predictions achieved with each data source; (3) a procedure that combines all data sources to provide a global summary and confidence estimate of the predictions. Gonna has been applied to all P. falciparum genes using most publicly available transcriptome, proteome and interactome data sources. Gonna provides predictions for numerous genes without any annotations. For example, 2 434 genes without any annotations in the Biological Process ontology are associated with specific GO terms (e.g. Rosetting, Antigenic variation), and among these, 841 have confidence values above 50%. In the Cellular Component and Molecular Function ontologies, 1 905 and 1 540 uncharacterized genes are associated with specific GO terms, respectively (740 and 329 with confidence value above 50%). CONCLUSION: All predictions along with their confidence values have been compiled in PlasmoDraft, which thus provides an extensive database of GO annotation predictions that can be achieved with these data sources. The database can be accessed in different ways. A global view allows for a quick inspection of the GO terms that are predicted with high confidence, depending on the various data sources. A gene view and a GO term view allow for the search of potential GO terms attached to a given gene, and genes that potentially belong to a given GO term. BioMed Central 2008-10-16 /pmc/articles/PMC2605471/ /pubmed/18925948 http://dx.doi.org/10.1186/1471-2105-9-440 Text en Copyright © 2008 Bréhélin et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Bréhélin, Laurent
Dufayard, Jean-François
Gascuel, Olivier
PlasmoDraft: a database of Plasmodium falciparum gene function predictions based on postgenomic data
title PlasmoDraft: a database of Plasmodium falciparum gene function predictions based on postgenomic data
title_sort plasmodraft: a database of plasmodium falciparum gene function predictions based on postgenomic data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2605471/
https://www.ncbi.nlm.nih.gov/pubmed/18925948
http://dx.doi.org/10.1186/1471-2105-9-440
work_keys_str_mv AT brehelinlaurent plasmodraftadatabaseofplasmodiumfalciparumgenefunctionpredictionsbasedonpostgenomicdata
AT dufayardjeanfrancois plasmodraftadatabaseofplasmodiumfalciparumgenefunctionpredictionsbasedonpostgenomicdata
AT gascuelolivier plasmodraftadatabaseofplasmodiumfalciparumgenefunctionpredictionsbasedonpostgenomicdata