Analysis of in vitro bioactivity data extracted from drug discovery literature and patents: Ranking 1654 human protein targets by assayed compounds and molecular scaffolds

BACKGROUND: Since the classic Hopkins and Groom druggable genome review in 2002, there have been a number of publications updating both the hypothetical and successful human drug target statistics. However, listings of research targets that define the area between these two extremes are sparse becau...

Bibliographic Details
Main authors: Southan, Christopher; Boppana, Kiran; Jagarlapudi, Sarma ARP; Muresan, Sorel
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3118229/
https://www.ncbi.nlm.nih.gov/pubmed/21569515
http://dx.doi.org/10.1186/1758-2946-3-14
collection PubMed
description BACKGROUND: Since the classic Hopkins and Groom druggable genome review in 2002, there have been a number of publications updating both the hypothetical and successful human drug target statistics. However, listings of research targets that define the area between these two extremes are sparse because of the challenges of collating published information at the necessary scale. We have addressed this by interrogating databases, populated by expert curation, of bioactivity data extracted from patents and journal papers over the last 30 years.
RESULTS: From a subset of just over 27,000 documents we have extracted a set of compound-to-target relationships for biochemical in vitro binding-type assay data for 1,736 human proteins and 1,654 gene identifiers. These are linked to 1,671,951 compound records derived from 823,179 unique chemical structures. The distribution showed a compounds-per-target average of 964 with a maximum of 42,869 (Factor Xa). The list includes non-targets, failed targets and cross-screening targets. The top-278 most actively pursued targets cover 90% of the compounds. We further investigated target ranking by determining the number of molecular frameworks and scaffolds. These were compared to the compound counts as alternative measures of chemical diversity on a per-target basis.
CONCLUSIONS: The compounds-per-protein listing generated in this work (provided as a supplementary file) represents the major proportion of the human drug target landscape defined by published data. We supplemented the simple ranking by the number of compounds assayed with additional rankings by molecular topology. These showed significant differences and provide complementary assessments of chemical tractability.
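The compounds-per-target ranking and the "top-N targets cover 90%" statistic described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `rank_targets` and `targets_for_coverage` are hypothetical, the toy records are invented, and coverage is assumed here to be measured over compound-to-target records rather than over unique structures.

```python
from collections import Counter


def rank_targets(records):
    """Rank targets by the number of distinct compounds assayed against them.

    `records` is an iterable of (compound_id, target_id) pairs; this record
    layout is an assumption made for illustration.
    """
    unique_pairs = set(records)  # ignore repeated compound-target pairs
    counts = Counter(target for _, target in unique_pairs)
    # Most-assayed target first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def targets_for_coverage(ranked, fraction=0.9):
    """Return how many top-ranked targets are needed to reach `fraction`
    of all compound-to-target records."""
    total = sum(count for _, count in ranked)
    running = 0
    for position, (_, count) in enumerate(ranked, start=1):
        running += count
        if running >= fraction * total:
            return position
    return len(ranked)


# Invented toy data: gene symbols are used only as labels.
toy_records = (
    [(f"c{i}", "F10") for i in range(5)]
    + [(f"c{i}", "BACE1") for i in range(3)]
    + [(f"c{i}", "REN") for i in range(2)]
    + [("c0", "F10")]  # duplicate pair, counted once
)

ranked = rank_targets(toy_records)
mean_per_target = sum(count for _, count in ranked) / len(ranked)
top_n = targets_for_coverage(ranked, fraction=0.9)
```

The same two functions could be applied to framework or scaffold identifiers instead of compound identifiers to obtain the alternative per-target rankings the abstract mentions, assuming such identifiers have already been computed.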
format Online
Article
Text
id pubmed-3118229
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3118229 2011-06-19 J Cheminform Research Article BioMed Central 2011-05-13 /pmc/articles/PMC3118229/ /pubmed/21569515 http://dx.doi.org/10.1186/1758-2946-3-14 Text en Copyright ©2011 Southan et al; licensee Chemistry Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research Article