
Turning Text into Research Networks: Information Retrieval and Computational Ontologies in the Creation of Scientific Databases

BACKGROUND: The number of web-based, free-text documents on science and technology has been growing. However, most of these documents are not immediately processable by computers, which slows down the acquisition of useful information. Computational ontologies might represent a possible solution...


Bibliographic Details
Main authors: Ceci, Flávio; Pietrobon, Ricardo; Gonçalves, Alexandre Leopoldo
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3250392/
https://www.ncbi.nlm.nih.gov/pubmed/22235242
http://dx.doi.org/10.1371/journal.pone.0027499
_version_ 1782220455386546176
author Ceci, Flávio
Pietrobon, Ricardo
Gonçalves, Alexandre Leopoldo
author_facet Ceci, Flávio
Pietrobon, Ricardo
Gonçalves, Alexandre Leopoldo
author_sort Ceci, Flávio
collection PubMed
description BACKGROUND: The number of web-based, free-text documents on science and technology has been growing. However, most of these documents are not immediately processable by computers, which slows down the acquisition of useful information. Computational ontologies might represent a possible solution by enabling semantically machine-readable data sets. Yet the process of ontology creation, instantiation, and maintenance is still based on manual methodologies and is thus time- and cost-intensive. METHOD: We focused on a large corpus containing information on researchers, research fields, and institutions. We based our strategy on traditional entity recognition, social computing, and correlation. We devised a semi-automatic approach for the recognition, correlation, and extraction of named entities and relations from textual documents, which are then used to create, instantiate, and maintain an ontology. RESULTS: We present a prototype demonstrating the applicability of the proposed strategy, along with a case study describing how direct and indirect relations can be extracted from academic and professional activities registered in a database of curricula vitae in free-text format. We present evidence that this system can identify entities to assist in the process of knowledge extraction and representation to support ontology maintenance. We also demonstrate the extraction of relationships among ontology classes and their instances. CONCLUSION: We have demonstrated that our system can be used to convert research information in free-text format into a database with a semantic structure. Future studies should test this system using the growing volume of free-text information available at the institutional and national levels.
format Online
Article
Text
id pubmed-3250392
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3250392 2012-01-10 Turning Text into Research Networks: Information Retrieval and Computational Ontologies in the Creation of Scientific Databases Ceci, Flávio Pietrobon, Ricardo Gonçalves, Alexandre Leopoldo PLoS One Research Article BACKGROUND: The number of web-based, free-text documents on science and technology has been growing. However, most of these documents are not immediately processable by computers, which slows down the acquisition of useful information. Computational ontologies might represent a possible solution by enabling semantically machine-readable data sets. Yet the process of ontology creation, instantiation, and maintenance is still based on manual methodologies and is thus time- and cost-intensive. METHOD: We focused on a large corpus containing information on researchers, research fields, and institutions. We based our strategy on traditional entity recognition, social computing, and correlation. We devised a semi-automatic approach for the recognition, correlation, and extraction of named entities and relations from textual documents, which are then used to create, instantiate, and maintain an ontology. RESULTS: We present a prototype demonstrating the applicability of the proposed strategy, along with a case study describing how direct and indirect relations can be extracted from academic and professional activities registered in a database of curricula vitae in free-text format. We present evidence that this system can identify entities to assist in the process of knowledge extraction and representation to support ontology maintenance. We also demonstrate the extraction of relationships among ontology classes and their instances. CONCLUSION: We have demonstrated that our system can be used to convert research information in free-text format into a database with a semantic structure. Future studies should test this system using the growing volume of free-text information available at the institutional and national levels. Public Library of Science 2012-01-03 /pmc/articles/PMC3250392/ /pubmed/22235242 http://dx.doi.org/10.1371/journal.pone.0027499 Text en Ceci et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Ceci, Flávio
Pietrobon, Ricardo
Gonçalves, Alexandre Leopoldo
Turning Text into Research Networks: Information Retrieval and Computational Ontologies in the Creation of Scientific Databases
title Turning Text into Research Networks: Information Retrieval and Computational Ontologies in the Creation of Scientific Databases
title_full Turning Text into Research Networks: Information Retrieval and Computational Ontologies in the Creation of Scientific Databases
title_fullStr Turning Text into Research Networks: Information Retrieval and Computational Ontologies in the Creation of Scientific Databases
title_full_unstemmed Turning Text into Research Networks: Information Retrieval and Computational Ontologies in the Creation of Scientific Databases
title_short Turning Text into Research Networks: Information Retrieval and Computational Ontologies in the Creation of Scientific Databases
title_sort turning text into research networks: information retrieval and computational ontologies in the creation of scientific databases
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3250392/
https://www.ncbi.nlm.nih.gov/pubmed/22235242
http://dx.doi.org/10.1371/journal.pone.0027499
work_keys_str_mv AT ceciflavio turningtextintoresearchnetworksinformationretrievalandcomputationalontologiesinthecreationofscientificdatabases
AT pietrobonricardo turningtextintoresearchnetworksinformationretrievalandcomputationalontologiesinthecreationofscientificdatabases
AT goncalvesalexandreleopoldo turningtextintoresearchnetworksinformationretrievalandcomputationalontologiesinthecreationofscientificdatabases