Building a large gene expression-cancer knowledge base with limited human annotations
Cancer prevention is one of the most pressing challenges that public health needs to face. In this regard, data-driven research is central to assist medical solutions targeting cancer. To fully harness the power of data-driven research, it is imperative to have well-organized machine-readable facts...
Main authors: | Marchesin, Stefano; Menotti, Laura; Giachelle, Fabio; Silvello, Gianmaria; Alonso, Omar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Original Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10533344/ https://www.ncbi.nlm.nih.gov/pubmed/37768281 http://dx.doi.org/10.1093/database/baad061 |
_version_ | 1785112170675044352 |
---|---|
author | Marchesin, Stefano Menotti, Laura Giachelle, Fabio Silvello, Gianmaria Alonso, Omar |
author_facet | Marchesin, Stefano Menotti, Laura Giachelle, Fabio Silvello, Gianmaria Alonso, Omar |
author_sort | Marchesin, Stefano |
collection | PubMed |
description | Cancer prevention is one of the most pressing challenges that public health needs to face. In this regard, data-driven research is central to assist medical solutions targeting cancer. To fully harness the power of data-driven research, it is imperative to have well-organized machine-readable facts into a knowledge base (KB). Motivated by this urgent need, we introduce the Collaborative Oriented Relation Extraction (CORE) system for building KBs with limited manual annotations. CORE is based on the combination of distant supervision and active learning paradigms and offers a seamless, transparent, modular architecture equipped for large-scale processing. We focus on precision medicine and build the largest KB on ‘fine-grained’ gene expression–cancer associations—a key to complement and validate experimental data for cancer research. We show the robustness of CORE and discuss the usefulness of the provided KB. Database URL https://zenodo.org/record/7577127 |
format | Online Article Text |
id | pubmed-10533344 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10533344 2023-09-28 Building a large gene expression-cancer knowledge base with limited human annotations Marchesin, Stefano Menotti, Laura Giachelle, Fabio Silvello, Gianmaria Alonso, Omar Database (Oxford) Original Article Cancer prevention is one of the most pressing challenges that public health needs to face. In this regard, data-driven research is central to assist medical solutions targeting cancer. To fully harness the power of data-driven research, it is imperative to have well-organized machine-readable facts into a knowledge base (KB). Motivated by this urgent need, we introduce the Collaborative Oriented Relation Extraction (CORE) system for building KBs with limited manual annotations. CORE is based on the combination of distant supervision and active learning paradigms and offers a seamless, transparent, modular architecture equipped for large-scale processing. We focus on precision medicine and build the largest KB on ‘fine-grained’ gene expression–cancer associations—a key to complement and validate experimental data for cancer research. We show the robustness of CORE and discuss the usefulness of the provided KB. Database URL https://zenodo.org/record/7577127 Oxford University Press 2023-09-27 /pmc/articles/PMC10533344/ /pubmed/37768281 http://dx.doi.org/10.1093/database/baad061 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Marchesin, Stefano Menotti, Laura Giachelle, Fabio Silvello, Gianmaria Alonso, Omar Building a large gene expression-cancer knowledge base with limited human annotations |
title | Building a large gene expression-cancer knowledge base with limited human annotations |
title_full | Building a large gene expression-cancer knowledge base with limited human annotations |
title_fullStr | Building a large gene expression-cancer knowledge base with limited human annotations |
title_full_unstemmed | Building a large gene expression-cancer knowledge base with limited human annotations |
title_short | Building a large gene expression-cancer knowledge base with limited human annotations |
title_sort | building a large gene expression-cancer knowledge base with limited human annotations |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10533344/ https://www.ncbi.nlm.nih.gov/pubmed/37768281 http://dx.doi.org/10.1093/database/baad061 |
work_keys_str_mv | AT marchesinstefano buildingalargegeneexpressioncancerknowledgebasewithlimitedhumanannotations AT menottilaura buildingalargegeneexpressioncancerknowledgebasewithlimitedhumanannotations AT giachellefabio buildingalargegeneexpressioncancerknowledgebasewithlimitedhumanannotations AT silvellogianmaria buildingalargegeneexpressioncancerknowledgebasewithlimitedhumanannotations AT alonsoomar buildingalargegeneexpressioncancerknowledgebasewithlimitedhumanannotations |