In silico re-identification of properties of drug target proteins

BACKGROUND: Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishi...

Bibliographic Details
Main authors: Kim, Baeksoo; Jo, Jihoon; Han, Jonghyun; Park, Chungoo; Lee, Hyunju
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5471946/
https://www.ncbi.nlm.nih.gov/pubmed/28617227
http://dx.doi.org/10.1186/s12859-017-1639-3
_version_ 1783244051109117952
author Kim, Baeksoo
Jo, Jihoon
Han, Jonghyun
Park, Chungoo
Lee, Hyunju
author_facet Kim, Baeksoo
Jo, Jihoon
Han, Jonghyun
Park, Chungoo
Lee, Hyunju
author_sort Kim, Baeksoo
collection PubMed
description BACKGROUND: Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated. METHODS: Using the DrugBank (version 3.0) database containing nearly 6,816 drug entries including 760 FDA-approved drugs and 1,822 of their targets and human UniProt/Swiss-Prot databases, we defined 1,578 non-redundant drug target and 17,575 non-drug target proteins. To select these non-redundant protein datasets, we built four datasets (A, B, C, and D) by considering clustering of paralogous proteins. RESULTS: We first reassessed the widely used properties of drug target proteins. We confirmed and extended earlier findings that drug target proteins (1) are likely to be more hydrophobic and less polar, with fewer PEST sequences and more signal peptide sequences, and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggable proteins show high expression levels and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we exploited two machine learning methods (Support Vector Machine and Random Forest). When we predicted drug targets by combining previously known protein properties and proposed new properties, an F-score of 0.8307 was obtained.
CONCLUSIONS: When the newly proposed properties are integrated, the prediction performance is improved and these properties are related to drug targets. We believe that our study will provide a new aspect in inferring drug-target interactions. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1639-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5471946
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5471946 2017-06-19 In silico re-identification of properties of drug target proteins Kim, Baeksoo Jo, Jihoon Han, Jonghyun Park, Chungoo Lee, Hyunju BMC Bioinformatics Research BACKGROUND: Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated. METHODS: Using the DrugBank (version 3.0) database containing nearly 6,816 drug entries including 760 FDA-approved drugs and 1,822 of their targets and human UniProt/Swiss-Prot databases, we defined 1,578 non-redundant drug target and 17,575 non-drug target proteins. To select these non-redundant protein datasets, we built four datasets (A, B, C, and D) by considering clustering of paralogous proteins. RESULTS: We first reassessed the widely used properties of drug target proteins. We confirmed and extended earlier findings that drug target proteins (1) are likely to be more hydrophobic and less polar, with fewer PEST sequences and more signal peptide sequences, and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggable proteins show high expression levels and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we exploited two machine learning methods (Support Vector Machine and Random Forest).
When we predicted drug targets by combining previously known protein properties and proposed new properties, an F-score of 0.8307 was obtained. CONCLUSIONS: When the newly proposed properties are integrated, the prediction performance is improved and these properties are related to drug targets. We believe that our study will provide a new aspect in inferring drug-target interactions. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1639-3) contains supplementary material, which is available to authorized users. BioMed Central 2017-05-31 /pmc/articles/PMC5471946/ /pubmed/28617227 http://dx.doi.org/10.1186/s12859-017-1639-3 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Kim, Baeksoo
Jo, Jihoon
Han, Jonghyun
Park, Chungoo
Lee, Hyunju
In silico re-identification of properties of drug target proteins
title In silico re-identification of properties of drug target proteins
title_full In silico re-identification of properties of drug target proteins
title_fullStr In silico re-identification of properties of drug target proteins
title_full_unstemmed In silico re-identification of properties of drug target proteins
title_short In silico re-identification of properties of drug target proteins
title_sort in silico re-identification of properties of drug target proteins
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5471946/
https://www.ncbi.nlm.nih.gov/pubmed/28617227
http://dx.doi.org/10.1186/s12859-017-1639-3
work_keys_str_mv AT kimbaeksoo insilicoreidentificationofpropertiesofdrugtargetproteins
AT jojihoon insilicoreidentificationofpropertiesofdrugtargetproteins
AT hanjonghyun insilicoreidentificationofpropertiesofdrugtargetproteins
AT parkchungoo insilicoreidentificationofpropertiesofdrugtargetproteins
AT leehyunju insilicoreidentificationofpropertiesofdrugtargetproteins