SIENA: Semi-automatic semantic enhancement of datasets using concept recognition

Bibliographic Details
Main Authors: Grigoriu, Andreea; Zaveri, Amrapali; Weiss, Gerhard; Dumontier, Michel
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7992819/
https://www.ncbi.nlm.nih.gov/pubmed/33761996
http://dx.doi.org/10.1186/s13326-021-00239-z
Description
Summary: BACKGROUND: The amount of available data that can help answer scientific research questions is growing. However, the variety of formats in which data are published is growing as well, which creates a serious challenge when multiple datasets must be integrated to answer a question. RESULTS: This paper presents a semi-automated framework that provides semantic enhancement of biomedical data, specifically gene datasets. The framework involves a concept recognition task using machine learning, in combination with the BioPortal Annotator. Compared with methods that rely on the BioPortal Annotator alone for semantic enhancement, the proposed framework achieves the best results. CONCLUSIONS: Using concept recognition combined with machine learning techniques and annotation with a biomedical ontology, the proposed framework can help datasets reach their full potential for providing meaningful information that can answer scientific research questions.