CBAG: Conditional biomedical abstract generation

Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene and an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with...

Bibliographic Details
Main Authors: Sybrandt, Justin; Safro, Ilya
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8259990/
https://www.ncbi.nlm.nih.gov/pubmed/34228754
http://dx.doi.org/10.1371/journal.pone.0253905
_version_ 1783718749871800320
author Sybrandt, Justin
Safro, Ilya
author_facet Sybrandt, Justin
Safro, Ilya
author_sort Sybrandt, Justin
collection PubMed
description Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene and an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with many documents in the MEDLINE database. While substantial recent work has addressed the problem of text generation in a more general context, applications such as scientific writing assistants or hypothesis generation systems could benefit from the capacity to select the specific set of concepts that underpin a generated biomedical text. We propose a conditional language model following the transformer architecture. This model uses the “encoder stack” to encode concepts that a user wishes to discuss in the generated text. The “decoder stack” then follows the masked self-attention pattern to perform text generation, using both prior tokens and the encoded condition. We demonstrate that this approach provides significant control, while still producing reasonable biomedical text.
format Online
Article
Text
id pubmed-8259990
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8259990 2021-07-19 CBAG: Conditional biomedical abstract generation Sybrandt, Justin Safro, Ilya PLoS One Research Article Public Library of Science 2021-07-06 /pmc/articles/PMC8259990/ /pubmed/34228754 http://dx.doi.org/10.1371/journal.pone.0253905 Text en © 2021 Sybrandt, Safro https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Sybrandt, Justin
Safro, Ilya
CBAG: Conditional biomedical abstract generation
title CBAG: Conditional biomedical abstract generation
title_full CBAG: Conditional biomedical abstract generation
title_fullStr CBAG: Conditional biomedical abstract generation
title_full_unstemmed CBAG: Conditional biomedical abstract generation
title_short CBAG: Conditional biomedical abstract generation
title_sort cbag: conditional biomedical abstract generation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8259990/
https://www.ncbi.nlm.nih.gov/pubmed/34228754
http://dx.doi.org/10.1371/journal.pone.0253905
work_keys_str_mv AT sybrandtjustin cbagconditionalbiomedicalabstractgeneration
AT safroilya cbagconditionalbiomedicalabstractgeneration