Semantic Deep Learning: Prior Knowledge and a Type of Four-Term Embedding Analogy to Acquire Treatments for Well-Known Diseases

BACKGROUND: How to treat a disease remains the most common type of clinical question. Obtaining evidence-based answers from biomedical literature is difficult. Analogical reasoning with embeddings from deep learning (embedding analogies) may extract such biomedical facts, although the state-of...

Bibliographic Details
Main Authors: Arguello Casteleiro, Mercedes; Des Diz, Julio; Maroto, Nava; Fernandez Prieto, Maria Jesus; Peters, Simon; Wroe, Chris; Sevillano Torrado, Carlos; Maseda Fernandez, Diego; Stevens, Robert
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7441383/
https://www.ncbi.nlm.nih.gov/pubmed/32759099
http://dx.doi.org/10.2196/16948
_version_ 1783573286142083072
author Arguello Casteleiro, Mercedes
Des Diz, Julio
Maroto, Nava
Fernandez Prieto, Maria Jesus
Peters, Simon
Wroe, Chris
Sevillano Torrado, Carlos
Maseda Fernandez, Diego
Stevens, Robert
collection PubMed
description BACKGROUND: How to treat a disease remains the most common type of clinical question. Obtaining evidence-based answers from biomedical literature is difficult. Analogical reasoning with embeddings from deep learning (embedding analogies) may extract such biomedical facts, although the state of the art focuses on pair-based proportional (pairwise) analogies such as man:woman::king:queen (“queen = −man +king +woman”).
OBJECTIVE: This study aimed to systematically extract disease treatment statements with a Semantic Deep Learning (SemDeep) approach underpinned by prior knowledge and another type of 4-term analogy (other than pairwise).
METHODS: As preliminaries, we investigated Continuous Bag-of-Words (CBOW) embedding analogies in a common-English corpus with five lines of text and observed a type of 4-term analogy (not pairwise) applying the 3CosAdd formula and relating the semantic fields person and death: “dagger = −Romeo +die +died” (search query: −Romeo +die +died). Our SemDeep approach worked with pre-existing items of knowledge (what is known) to make inferences sanctioned by a 4-term analogy (search query −x +z1 +z2) from CBOW and Skip-gram embeddings created with a PubMed systematic reviews subset (PMSB dataset). Stage 1: Knowledge acquisition. Obtaining a set of terms, candidate y, from embeddings using vector arithmetic. Some n-gram pairs obtained with cosine similarity and validated with evidence (prior knowledge) are the input for the 3CosAdd, seeking a type of 4-term analogy relating the semantic fields disease and treatment. Stage 2: Knowledge organization. Identification of candidates sanctioned by the analogy belonging to the semantic field treatment and mapping these candidates to Unified Medical Language System (UMLS) Metathesaurus concepts with MetaMap. A concept pair is a brief disease treatment statement (biomedical fact). Stage 3: Knowledge validation. An evidence-based evaluation followed by human validation of biomedical facts potentially useful for clinicians.
RESULTS: We obtained 5352 n-gram pairs from 446 search queries by applying the 3CosAdd. The microaveraging performance of MetaMap for candidate y belonging to the semantic field treatment was F-measure=80.00% (precision=77.00%, recall=83.25%). We developed an empirical heuristic with some predictive power for clinical winners, that is, search queries bringing candidate y with evidence of a therapeutic intent for target disease x. The search queries −asthma +inhaled_corticosteroids +inhaled_corticosteroid and −epilepsy +valproate +antiepileptic_drug were clinical winners, finding eight evidence-based beneficial treatments.
CONCLUSIONS: Extracting treatments with therapeutic intent by analogical reasoning from embeddings (423K n-grams from the PMSB dataset) is an ambitious goal. Our SemDeep approach is knowledge-based, underpinned by embedding analogies that exploit prior knowledge. Biomedical facts from embedding analogies (4-term type, not pairwise) are potentially useful for clinicians. The heuristic offers a practical way to discover beneficial treatments for well-known diseases. Learning from deep learning models does not require a massive amount of data. Embedding analogies are not limited to pairwise analogies; hence, analogical reasoning with embeddings is underexploited.
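As an illustration only, the 4-term search query (−x +z1 +z2) named in METHODS can be sketched in Python with NumPy. The function name `three_cos_add`, the 3-dimensional vectors, and the presence of `budesonide` in the toy vocabulary are assumptions made for this sketch; none of them comes from the PMSB embeddings or from the authors' code. With unit-length vectors, ranking candidates by cosine to (z1 + z2 − x) gives the same order as the 3CosAdd score cos(y, z1) + cos(y, z2) − cos(y, x).

```python
import numpy as np


def three_cos_add(embeddings, minus, plus, topn=3):
    """Rank candidate terms y for a query of the form -x +z1 +z2.

    embeddings: dict mapping term -> 1-D NumPy vector (hypothetical input).
    minus / plus: lists of terms subtracted from / added to the query.
    """
    # Unit-normalise every vector so dot products equal cosine similarities.
    unit = {t: v / np.linalg.norm(v) for t, v in embeddings.items()}

    # Build the query vector: sum of the "+" terms minus the "-" terms.
    target = sum(unit[t] for t in plus) - sum(unit[t] for t in minus)
    target = target / np.linalg.norm(target)

    # Score every term that is not part of the query itself.
    excluded = set(plus) | set(minus)
    scores = {
        t: float(v @ target) for t, v in unit.items() if t not in excluded
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:topn]


# Hand-made toy vectors (assumption): axes loosely read as
# (disease-like, treatment-like, respiratory-topic). Not real embeddings.
toy = {
    "asthma": np.array([1.0, 0.0, 1.0]),
    "inhaled_corticosteroids": np.array([0.0, 1.0, 1.0]),
    "inhaled_corticosteroid": np.array([0.0, 1.0, 0.9]),
    "budesonide": np.array([0.0, 1.0, 0.8]),
    "epilepsy": np.array([1.0, 0.0, -1.0]),
    "valproate": np.array([0.0, 1.0, -1.0]),
}

# Query in the abstract's notation: -asthma +inhaled_corticosteroids +inhaled_corticosteroid
ranked = three_cos_add(
    toy,
    minus=["asthma"],
    plus=["inhaled_corticosteroids", "inhaled_corticosteroid"],
)
for term, score in ranked:
    print(f"{term}\t{score:.3f}")
```

Excluding the query terms from the candidate list is a common convention in analogy retrieval; the abstract does not say whether the study applied it.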
format Online
Article
Text
id pubmed-7441383
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7441383 2020-08-31
Semantic Deep Learning: Prior Knowledge and a Type of Four-Term Embedding Analogy to Acquire Treatments for Well-Known Diseases
Arguello Casteleiro, Mercedes; Des Diz, Julio; Maroto, Nava; Fernandez Prieto, Maria Jesus; Peters, Simon; Wroe, Chris; Sevillano Torrado, Carlos; Maseda Fernandez, Diego; Stevens, Robert
JMIR Med Inform, Original Paper
JMIR Publications 2020-08-06 /pmc/articles/PMC7441383/ /pubmed/32759099 http://dx.doi.org/10.2196/16948
Text en ©Mercedes Arguello Casteleiro, Julio Des Diz, Nava Maroto, Maria Jesus Fernandez Prieto, Simon Peters, Chris Wroe, Carlos Sevillano Torrado, Diego Maseda Fernandez, Robert Stevens. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 06.08.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Semantic Deep Learning: Prior Knowledge and a Type of Four-Term Embedding Analogy to Acquire Treatments for Well-Known Diseases
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7441383/
https://www.ncbi.nlm.nih.gov/pubmed/32759099
http://dx.doi.org/10.2196/16948