
Building a large gene expression-cancer knowledge base with limited human annotations

Cancer prevention is one of the most pressing challenges that public health needs to face. In this regard, data-driven research is central to assist medical solutions targeting cancer. To fully harness the power of data-driven research, it is imperative to have well-organized machine-readable facts...


Bibliographic Details
Main authors: Marchesin, Stefano; Menotti, Laura; Giachelle, Fabio; Silvello, Gianmaria; Alonso, Omar
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10533344/
https://www.ncbi.nlm.nih.gov/pubmed/37768281
http://dx.doi.org/10.1093/database/baad061
author Marchesin, Stefano
Menotti, Laura
Giachelle, Fabio
Silvello, Gianmaria
Alonso, Omar
collection PubMed
description Cancer prevention is one of the most pressing challenges that public health needs to face. In this regard, data-driven research is central to assist medical solutions targeting cancer. To fully harness the power of data-driven research, it is imperative to have well-organized machine-readable facts into a knowledge base (KB). Motivated by this urgent need, we introduce the Collaborative Oriented Relation Extraction (CORE) system for building KBs with limited manual annotations. CORE is based on the combination of distant supervision and active learning paradigms and offers a seamless, transparent, modular architecture equipped for large-scale processing. We focus on precision medicine and build the largest KB on ‘fine-grained’ gene expression–cancer associations—a key to complement and validate experimental data for cancer research. We show the robustness of CORE and discuss the usefulness of the provided KB. Database URL https://zenodo.org/record/7577127
format Online
Article
Text
id pubmed-10533344
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-105333442023-09-28 Building a large gene expression-cancer knowledge base with limited human annotations Marchesin, Stefano Menotti, Laura Giachelle, Fabio Silvello, Gianmaria Alonso, Omar Database (Oxford) Original Article Cancer prevention is one of the most pressing challenges that public health needs to face. In this regard, data-driven research is central to assist medical solutions targeting cancer. To fully harness the power of data-driven research, it is imperative to have well-organized machine-readable facts into a knowledge base (KB). Motivated by this urgent need, we introduce the Collaborative Oriented Relation Extraction (CORE) system for building KBs with limited manual annotations. CORE is based on the combination of distant supervision and active learning paradigms and offers a seamless, transparent, modular architecture equipped for large-scale processing. We focus on precision medicine and build the largest KB on ‘fine-grained’ gene expression–cancer associations—a key to complement and validate experimental data for cancer research. We show the robustness of CORE and discuss the usefulness of the provided KB. Database URL https://zenodo.org/record/7577127 Oxford University Press 2023-09-27 /pmc/articles/PMC10533344/ /pubmed/37768281 http://dx.doi.org/10.1093/database/baad061 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Building a large gene expression-cancer knowledge base with limited human annotations
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10533344/
https://www.ncbi.nlm.nih.gov/pubmed/37768281
http://dx.doi.org/10.1093/database/baad061