
Ontology-guided data preparation for discovering genotype-phenotype relationships

BACKGROUND: Complexity and amount of post-genomic data constitute two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in life sciences. Bio-ontologies may nowadays play key roles in knowledge discovery in life science providing semantics to data and to extrac...


Bibliographic Details
Main Authors: Coulet, Adrien, Smaïl-Tabbone, Malika, Benlian, Pascale, Napoli, Amedeo, Devignes, Marie-Dominique
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367630/
https://www.ncbi.nlm.nih.gov/pubmed/18460176
http://dx.doi.org/10.1186/1471-2105-9-S4-S3
_version_ 1782154337915502592
author Coulet, Adrien
Smaïl-Tabbone, Malika
Benlian, Pascale
Napoli, Amedeo
Devignes, Marie-Dominique
collection PubMed
description BACKGROUND: Complexity and amount of post-genomic data constitute two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in life sciences. Bio-ontologies may nowadays play key roles in knowledge discovery in life science providing semantics to data and to extracted units, by taking advantage of the progress of Semantic Web technologies concerning the understanding and availability of tools for knowledge representation, extraction, and reasoning. RESULTS: This paper presents a method that exploits bio-ontologies for guiding data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, class descriptions, are taken into account for data selection, before the data mining step. Each of these scenarios is illustrated within a case-study relative to the search of genotype-phenotype relationships in a familial hypercholesterolemia dataset. The guiding of data selection based on domain knowledge is analysed and shows a direct influence on the volume and significance of the data mining results. CONCLUSIONS: The method proposed in this paper is an efficient alternative to numerical methods for data selection based on domain knowledge. In turn, the results of this study may be reused in ontology modelling and data integration.
format Text
id pubmed-2367630
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2367630 2008-05-07 Ontology-guided data preparation for discovering genotype-phenotype relationships Coulet, Adrien Smaïl-Tabbone, Malika Benlian, Pascale Napoli, Amedeo Devignes, Marie-Dominique BMC Bioinformatics Research BioMed Central 2008-04-25 /pmc/articles/PMC2367630/ /pubmed/18460176 http://dx.doi.org/10.1186/1471-2105-9-S4-S3 Text en Copyright © 2008 Coulet et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Ontology-guided data preparation for discovering genotype-phenotype relationships
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367630/
https://www.ncbi.nlm.nih.gov/pubmed/18460176
http://dx.doi.org/10.1186/1471-2105-9-S4-S3