Integration and publication of heterogeneous text-mined relationships on the Semantic Web

BACKGROUND: Advances in Natural Language Processing (NLP) techniques enable the extraction of fine-grained relationships mentioned in biomedical text. The variability and the complexity of natural language in expressing similar relationships cause the extracted relationships to be highly heterogene...

Bibliographic Details
Main Authors: Coulet, Adrien; Garten, Yael; Dumontier, Michel; Altman, Russ B; Musen, Mark A; Shah, Nigam H
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3102890/
https://www.ncbi.nlm.nih.gov/pubmed/21624156
http://dx.doi.org/10.1186/2041-1480-2-S2-S10
author Coulet, Adrien
Garten, Yael
Dumontier, Michel
Altman, Russ B
Musen, Mark A
Shah, Nigam H
author_sort Coulet, Adrien
collection PubMed
description BACKGROUND: Advances in Natural Language Processing (NLP) techniques enable the extraction of fine-grained relationships mentioned in biomedical text. The variability and the complexity of natural language in expressing similar relationships cause the extracted relationships to be highly heterogeneous, which makes the construction of knowledge bases difficult and poses a challenge in using these for data mining or question answering. RESULTS: We report on the semi-automatic construction of the PHARE relationship ontology (the PHArmacogenomic RElationships Ontology) consisting of 200 curated relations from over 40,000 heterogeneous relationships extracted via text-mining. These heterogeneous relations are then mapped to the PHARE ontology using synonyms, entity descriptions and hierarchies of entities and roles. Once mapped, relationships can be normalized and compared using the structure of the ontology to identify relationships that have similar semantics but different syntax. We compare and contrast the manual procedure with a fully automated approach using WordNet to quantify the degree of integration enabled by iterative curation and refinement of the PHARE ontology. The result of such integration is a repository of normalized biomedical relationships, named PHARE-KB, which can be queried using Semantic Web technologies such as SPARQL and can be visualized in the form of a biological network. CONCLUSIONS: The PHARE ontology serves as a common semantic framework to integrate more than 40,000 relationships pertinent to pharmacogenomics. The PHARE ontology forms the foundation of a knowledge base named PHARE-KB. Once populated with relationships, PHARE-KB (i) can be visualized in the form of a biological network to guide human tasks such as database curation and (ii) can be queried programmatically to guide bioinformatics applications such as the prediction of molecular interactions. PHARE is available at http://purl.bioontology.org/ontology/PHARE.
format Text
id pubmed-3102890
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3102890 2011-05-28 Integration and publication of heterogeneous text-mined relationships on the Semantic Web Coulet, Adrien Garten, Yael Dumontier, Michel Altman, Russ B Musen, Mark A Shah, Nigam H J Biomed Semantics Proceedings BioMed Central 2011-05-17 /pmc/articles/PMC3102890/ /pubmed/21624156 http://dx.doi.org/10.1186/2041-1480-2-S2-S10 Text en Copyright ©2011 Coulet et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Integration and publication of heterogeneous text-mined relationships on the Semantic Web
topic Proceedings