Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation

BACKGROUND: Biomedical knowledge is dispersed in scientific literature and is growing constantly. Curation is the extraction of knowledge from unstructured data into a computable form and can be done manually or automatically. Hypertrophic cardiomyopathy (HCM) is the most common inherited cardiac...

Bibliographic Details
Main authors: Glavaški, Mila; Velicki, Lazar
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8487578/
https://www.ncbi.nlm.nih.gov/pubmed/34600580
http://dx.doi.org/10.1186/s13040-021-00279-2
_version_ 1784577986265088000
author Glavaški, Mila
Velicki, Lazar
author_facet Glavaški, Mila
Velicki, Lazar
author_sort Glavaški, Mila
collection PubMed
description BACKGROUND: Biomedical knowledge is dispersed in scientific literature and is growing constantly. Curation is the extraction of knowledge from unstructured data into a computable form and can be done manually or automatically. Hypertrophic cardiomyopathy (HCM) is the most common inherited cardiac disease, with genotype–phenotype associations still incompletely understood. We compared human- and machine-curated models of HCM molecular mechanisms and examined the performance of different machine approaches for that task. RESULTS: We created six models representing HCM molecular mechanisms using different approaches and made them publicly available, analyzed them as networks, and sought to explain the differences between the models by analyzing factors that affect the quality of machine-curated models (query constraints and reading systems’ performance). A result of this work is also the Interactive HCM map, the only publicly available knowledge resource dedicated to HCM. Sizes and topological parameters of the networks differed notably, and a low consensus was found in terms of centrality measures between networks. Consensus about the most important nodes was achieved only with respect to one element (calcium). Models with a reduced level of noise were generated and cooperatively working elements were detected. The REACH and TRIPS reading systems showed much higher accuracy than Sparser, but at the cost of extraction performance. TRIPS proved to be the best single reading system for text segments about HCM, in terms of the compromise between accuracy and extraction performance. CONCLUSIONS: Different approaches in curation can produce models of the same disease with diverse characteristics, and they give rise to utterly different conclusions in subsequent analysis. The final purpose of the model should direct the choice of curation techniques. Manual curation represents the gold standard for information extraction in biomedical research and is most suitable when only high-quality elements for models are required. Automated curation provides more substance, but a high level of noise is expected. Different curation strategies can reduce the level of human input needed. Biomedical knowledge would benefit overwhelmingly, especially given its rapid growth, if computers were able to assist in analysis on a larger scale. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-021-00279-2.
format Online
Article
Text
id pubmed-8487578
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8487578 2021-10-04 Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation Glavaški, Mila Velicki, Lazar BioData Min Research BioMed Central 2021-10-02 /pmc/articles/PMC8487578/ /pubmed/34600580 http://dx.doi.org/10.1186/s13040-021-00279-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Glavaški, Mila
Velicki, Lazar
Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation
title Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation
title_full Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation
title_fullStr Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation
title_full_unstemmed Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation
title_short Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation
title_sort humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms’ representation
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8487578/
https://www.ncbi.nlm.nih.gov/pubmed/34600580
http://dx.doi.org/10.1186/s13040-021-00279-2
work_keys_str_mv AT glavaskimila humansandmachinesinbiomedicalknowledgecurationhypertrophiccardiomyopathymolecularmechanismsrepresentation
AT velickilazar humansandmachinesinbiomedicalknowledgecurationhypertrophiccardiomyopathymolecularmechanismsrepresentation