
Illuminating the druggable genome through patent bioactivity data

The patent literature is a potentially valuable source of bioactivity data. In this article we describe a process to prioritise 3.7 million life science relevant patents obtained from the SureChEMBL database (https://www.surechembl.org/), according to how likely they were to contain bioactivity data...


Bibliographic Details
Main Authors: Magariños, Maria P., Gaulton, Anna, Félix, Eloy, Kiziloren, Tevfik, Arcila, Ricardo, Oprea, Tudor I., Leach, Andrew R.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Biochemistry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10162037/
https://www.ncbi.nlm.nih.gov/pubmed/37151295
http://dx.doi.org/10.7717/peerj.15153
author Magariños, Maria P.
Gaulton, Anna
Félix, Eloy
Kiziloren, Tevfik
Arcila, Ricardo
Oprea, Tudor I.
Leach, Andrew R.
collection PubMed
description The patent literature is a potentially valuable source of bioactivity data. In this article we describe a process to prioritise 3.7 million life science relevant patents obtained from the SureChEMBL database (https://www.surechembl.org/), according to how likely they were to contain bioactivity data for potent small molecules on less-studied targets, based on the classification developed by the Illuminating the Druggable Genome (IDG) project. The overall goal was to select a smaller number of patents that could be manually curated and incorporated into the ChEMBL database. Using relatively simple annotation and filtering pipelines, we have been able to identify a substantial number of patents containing quantitative bioactivity data for understudied targets that had not previously been reported in the peer-reviewed medicinal chemistry literature. We quantify the added value of such methods in terms of the numbers of targets that are so identified, and provide some specific illustrative examples. Our work underlines the potential value in searching the patent corpus in addition to the more traditional peer-reviewed literature. The small molecules found in these patents, together with their measured activity against the targets, are now accessible via the ChEMBL database.
format Online
Article
Text
id pubmed-10162037
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-10162037 2023-05-06 PeerJ Biochemistry PeerJ Inc. 2023-05-02 /pmc/articles/PMC10162037/ /pubmed/37151295 http://dx.doi.org/10.7717/peerj.15153 Text en © 2023 Magariños et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Illuminating the druggable genome through patent bioactivity data
topic Biochemistry