
Towards a semi-automatic functional annotation tool based on decision-tree techniques

BACKGROUND: Due to continuous improvements in high-throughput technologies and experimental procedures, the number of sequenced genomes is increasing exponentially. Ultimately, the task of annotating these data relies on the expertise of biologists. The necessity for annotation to be supervised...

Bibliographic Details
Main Authors: Azé, Jérôme, Gentils, Lucie, Toffano-Nioche, Claire, Loux, Valentin, Gibrat, Jean-François, Bessières, Philippe, Rouveirol, Céline, Poupon, Anne, Froidevaux, Christine
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2654970/
https://www.ncbi.nlm.nih.gov/pubmed/19091050
author Azé, Jérôme
Gentils, Lucie
Toffano-Nioche, Claire
Loux, Valentin
Gibrat, Jean-François
Bessières, Philippe
Rouveirol, Céline
Poupon, Anne
Froidevaux, Christine
author_sort Azé, Jérôme
collection PubMed
description BACKGROUND: Due to continuous improvements in high-throughput technologies and experimental procedures, the number of sequenced genomes is increasing exponentially. Ultimately, the task of annotating these data relies on the expertise of biologists. The need for annotation to be supervised by human experts is the rate-limiting step of the data analysis. To cope with the deluge of new genomic data, automating the annotation process as much as possible becomes critical. RESULTS: We consider the annotation of a protein with terms of the functional hierarchy that has been used to annotate Bacillus subtilis, and propose a set of rules that predict classes in terms of elements of this hierarchy, i.e., a class is a node or a leaf of the hierarchy tree. The rules are obtained through two decision-tree techniques, first-order decision trees and multilabel attribute-value decision trees, using as training data the proteins from two lactic acid bacteria: Lactobacillus sakei and Lactobacillus bulgaricus. We tested the two methods, first independently and then in a combined approach, and evaluated the results using hierarchical evaluation measures. The results obtained with the two approaches on both genomes are comparable and show good precision together with a high prediction rate. Combining the approaches increases recall and the prediction rate. CONCLUSION: The combination of the two approaches is very encouraging, and we will further refine these combinations to obtain rules that are even more useful for annotators. This first study is a crucial step towards designing a semi-automatic functional annotation tool.
format Text
id pubmed-2654970
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2654970 2009-03-13 Towards a semi-automatic functional annotation tool based on decision-tree techniques BMC Proc Proceedings
BioMed Central 2008-12-17 /pmc/articles/PMC2654970/ /pubmed/19091050 Text en Copyright © 2008 Azé et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Towards a semi-automatic functional annotation tool based on decision-tree techniques
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2654970/
https://www.ncbi.nlm.nih.gov/pubmed/19091050