
Semantic text mining in early drug discovery for type 2 diabetes

BACKGROUND: Surveying the scientific literature is an important part of early drug discovery; and with the ever-increasing amount of biomedical publications it is imperative to focus on the most interesting articles. Here we present a project that highlights new understanding (e.g. recently discover...


Bibliographic Details
Main authors: Hansson, Lena K., Hansen, Rasmus Borup, Pletscher-Frankild, Sune, Berzins, Rudolfs, Hansen, Daniel Hvidberg, Madsen, Dennis, Christensen, Sten B., Christiansen, Malene Revsbech, Boulund, Ulrika, Wolf, Xenia Asbæk, Kjærulff, Sonny Kim, van de Bunt, Martijn, Tulin, Søren, Jensen, Thomas Skøt, Wernersson, Rasmus, Jensen, Jan Nygaard
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7295186/
https://www.ncbi.nlm.nih.gov/pubmed/32542027
http://dx.doi.org/10.1371/journal.pone.0233956
_version_ 1783546603974426624
author Hansson, Lena K.
Hansen, Rasmus Borup
Pletscher-Frankild, Sune
Berzins, Rudolfs
Hansen, Daniel Hvidberg
Madsen, Dennis
Christensen, Sten B.
Christiansen, Malene Revsbech
Boulund, Ulrika
Wolf, Xenia Asbæk
Kjærulff, Sonny Kim
van de Bunt, Martijn
Tulin, Søren
Jensen, Thomas Skøt
Wernersson, Rasmus
Jensen, Jan Nygaard
author_sort Hansson, Lena K.
collection PubMed
description BACKGROUND: Surveying the scientific literature is an important part of early drug discovery; and with the ever-increasing amount of biomedical publications it is imperative to focus on the most interesting articles. Here we present a project that highlights new understanding (e.g. recently discovered modes of action) and identifies potential drug targets, via a novel, data-driven text mining approach to score type 2 diabetes (T2D) relevance. We focused on monitoring trends and jumps in T2D relevance to help us be timely informed of important breakthroughs. METHODS: We extracted over 7 million n-grams from PubMed abstracts and then clustered around 240,000 linked to T2D into almost 50,000 T2D relevant ‘semantic concepts’. To score papers, we weighted the concepts based on co-mentioning with core T2D proteins. A protein’s T2D relevance was determined by combining the scores of the papers mentioning it in the five preceding years. Each week all proteins were ranked according to their T2D relevance. Furthermore, the historical distribution of changes in rank from one week to the next was used to calculate the significance of a change in rank by T2D relevance for each protein. RESULTS: We show that T2D relevant papers, even those not mentioning T2D explicitly, were prioritised by relevant semantic concepts. Well known T2D proteins were therefore enriched among the top scoring proteins. Our ‘high jumpers’ identified important past developments in the apprehension of how certain key proteins relate to T2D, indicating that our method will make us aware of future breakthroughs. In summary, this project facilitated keeping up with current T2D research by repeatedly providing short lists of potential novel targets into our early drug discovery pipeline.
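
The METHODS summary above describes three steps: combining paper scores per protein over the five preceding years, ranking proteins weekly, and judging a rank change against the historical distribution of week-to-week changes. The following is a minimal Python sketch of that workflow, not the authors' implementation. The abstract does not say how scores are combined or how significance is computed, so the plain sum, the 365-day year, the add-one smoothed empirical p-value, and all function and field names here are assumptions made for illustration.

```python
from datetime import date, timedelta


def protein_relevance(papers, protein, as_of, window_years=5):
    """Sum the scores of papers mentioning `protein` within the window.

    Assumption: 'combining' is a plain sum; the source does not specify it.
    """
    start = as_of - timedelta(days=365 * window_years)
    return sum(
        p["score"]
        for p in papers
        if protein in p["proteins"] and start <= p["date"] <= as_of
    )


def rank_proteins(relevance):
    """Map each protein to its rank (1 = highest relevance).

    Ties are broken alphabetically so the ranking is deterministic.
    """
    ordered = sorted(relevance, key=lambda name: (-relevance[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}


def jump_p_value(historical_changes, observed_change):
    """Empirical one-sided p-value for a rank improvement.

    Counts historical week-to-week changes at least as large as the
    observed one; add-one smoothing (an assumption) avoids p = 0.
    """
    at_least = sum(1 for c in historical_changes if c >= observed_change)
    return (at_least + 1) / (len(historical_changes) + 1)


# Hypothetical toy data: paper scores and protein mentions are made up.
papers = [
    {"date": date(2019, 1, 1), "score": 2.0, "proteins": {"INS", "GCK"}},
    {"date": date(2012, 1, 1), "score": 5.0, "proteins": {"INS"}},
    {"date": date(2020, 1, 1), "score": 1.5, "proteins": {"GCK"}},
]
as_of = date(2020, 6, 15)
relevance = {name: protein_relevance(papers, name, as_of) for name in ("INS", "GCK")}
ranks = rank_proteins(relevance)
```

In this sketch a 'high jumper' would be a protein whose latest rank improvement yields a small `jump_p_value` against its own or the pooled history; which history to use is a design choice the abstract leaves open.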
format Online
Article
Text
id pubmed-7295186
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7295186 2020-06-19 PLoS One Research Article Public Library of Science 2020-06-15 /pmc/articles/PMC7295186/ /pubmed/32542027 http://dx.doi.org/10.1371/journal.pone.0233956 Text en © 2020 Hansson et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Semantic text mining in early drug discovery for type 2 diabetes
topic Research Article