Biotea: RDFizing PubMed Central in support for the paper as an interface to the Web of Data

Bibliographic Details
Main authors: Garcia Castro, L Jael; McLaughlin, C; Garcia, A
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3804025/
https://www.ncbi.nlm.nih.gov/pubmed/23734622
http://dx.doi.org/10.1186/2041-1480-4-S1-S5
Description
Summary:
BACKGROUND: The World Wide Web has become a dissemination platform for scientific and non-scientific publications. However, most of the information remains locked up in discrete documents that are not always interconnected or machine-readable. The connectivity tissue provided by RDF technology has not yet been widely used to support the generation of self-describing, machine-readable documents.
RESULTS: In this paper, we present our approach to the generation of self-describing, machine-readable scholarly documents. We understand the scientific document as an entry point and interface to the Web of Data. We have semantically processed the full-text, open-access subset of PubMed Central. Our RDF model and resulting dataset make extensive use of existing ontologies and semantic enrichment services. We expose our model, services, prototype, and datasets at http://biotea.idiginfo.org/.
CONCLUSIONS: The semantic processing of biomedical literature presented in this paper embeds documents within the Web of Data and facilitates the execution of concept-based queries against the entire digital library. Our approach delivers a flexible and adaptable set of tools for metadata enrichment and semantic processing of biomedical documents. Our model delivers a semantically rich and highly interconnected dataset with self-describing content so that software can make effective use of it.
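
To give a concrete sense of what a "self-describing, machine-readable" description of a paper can look like, the following is a minimal sketch that serializes this record's own bibliographic fields as N-Triples. It is not the Biotea RDF model: the choice of Dublin Core terms and `rdfs:seeAlso`, the `record` dictionary layout, and the helper names `escape_literal`, `triple`, and `record_to_ntriples` are all assumptions made for illustration.

```python
# Illustrative only: NOT the Biotea RDF model. A minimal N-Triples
# serialization of this catalog record using Dublin Core terms (assumed vocabulary).

DCTERMS = "http://purl.org/dc/terms/"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def escape_literal(text):
    """Escape the characters that must be escaped inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def triple(subject, predicate, obj, is_uri=False):
    """Format one N-Triples statement; obj is either a URI or a plain literal."""
    rendered = f"<{obj}>" if is_uri else f'"{escape_literal(obj)}"'
    return f"<{subject}> <{predicate}> {rendered} ."


def record_to_ntriples(record):
    """Turn a flat bibliographic record (hypothetical dict layout) into N-Triples lines."""
    subject = record["doi_url"]
    lines = [triple(subject, DCTERMS + "title", record["title"])]
    for author in record["authors"]:
        lines.append(triple(subject, DCTERMS + "creator", author))
    lines.append(triple(subject, DCTERMS + "publisher", record["publisher"]))
    lines.append(triple(subject, DCTERMS + "issued", record["year"]))
    for url in record["see_also"]:
        lines.append(triple(subject, RDFS_SEE_ALSO, url, is_uri=True))
    return lines


# Values copied from the record above.
record = {
    "doi_url": "http://dx.doi.org/10.1186/2041-1480-4-S1-S5",
    "title": "Biotea: RDFizing PubMed Central in support for the paper "
             "as an interface to the Web of Data",
    "authors": ["Garcia Castro, L Jael", "McLaughlin, C", "Garcia, A"],
    "publisher": "BioMed Central",
    "year": "2013",
    "see_also": [
        "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3804025/",
        "https://www.ncbi.nlm.nih.gov/pubmed/23734622",
    ],
}

for line in record_to_ntriples(record):
    print(line)
```

Each printed line is one subject-predicate-object statement about the article, identified by its DOI URL. Statements in this form can be loaded into a triple store and queried by property rather than by matching text.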