Integration of various protein similarities using random forest technique to infer augmented drug-protein matrix for enhancing drug-disease association prediction

Identifying new therapeutic indications for existing drugs is a major challenge in drug repositioning. Most computational drug repositioning methods focus on known targets. Analyzing multiple aspects of various protein associations provides an opportunity to discover underlying drug-associated prote...

Bibliographic Details
Main authors: Kitsiranuwat, Satanat; Suratanee, Apichat; Plaimas, Kitiporn
Format: Online Article Text
Language: English
Published: SAGE Publications 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10358641/
https://www.ncbi.nlm.nih.gov/pubmed/35801312
http://dx.doi.org/10.1177/00368504221109215
author Kitsiranuwat, Satanat
Suratanee, Apichat
Plaimas, Kitiporn
collection PubMed
description Identifying new therapeutic indications for existing drugs is a major challenge in drug repositioning. Most computational drug repositioning methods focus on known targets. Analyzing multiple aspects of protein associations provides an opportunity to discover underlying drug-associated proteins that can be used to improve the performance of drug repositioning approaches. In this study, machine learning models were developed based on the similarities of diverse biological features, including protein interaction, network topology, sequence alignment, and biological function, to predict protein pairs associated with the same drugs. The crucial set of features was identified, and protein pair prediction achieved high performance, with an area under the curve (AUC) value of more than 93%. Based on drug chemical structures, the drug similarity levels of the promising protein pairs were used to quantify the inferred drug-associated proteins. Furthermore, these proteins were used to establish an augmented drug-protein matrix to enhance the efficiency of three existing drug repositioning techniques: a similarity constrained matrix factorization for drug-disease associations (SCMFDD), an ensemble meta-paths and singular value decomposition (EMP-SVD) model, and a topology similarity and singular value decomposition (TS-SVD) technique. The results showed that, compared with the original matrix, the augmented matrix improved performance by up to 4% for SCMFDD and EMP-SVD, and by about 1% for TS-SVD. In summary, inferring new protein pairs related to the same drugs increases the opportunity to reveal missing drug-associated proteins that are important for drug development via drug repositioning.
format Online
Article
Text
id pubmed-10358641
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-10358641 2023-08-09 Sci Prog Original Manuscript SAGE Publications 2022-07-08 /pmc/articles/PMC10358641/ /pubmed/35801312 http://dx.doi.org/10.1177/00368504221109215 Text en © The Author(s) 2022. This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Integration of various protein similarities using random forest technique to infer augmented drug-protein matrix for enhancing drug-disease association prediction
topic Original Manuscript
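
The abstract describes building an augmented drug-protein matrix from protein pairs predicted to share drugs. The following is a minimal sketch of that idea only, not the authors' implementation. The function name `augment_drug_protein`, the `threshold` parameter, the toy identifiers, and the use of the pair score as the weight of an inferred entry are all assumptions made for illustration; the article quantifies inferred proteins using drug similarity levels based on chemical structure, which this sketch does not reproduce.

```python
from collections import defaultdict


def augment_drug_protein(known, predicted_pairs, threshold=0.5):
    """Return a weighted drug -> {protein: weight} mapping.

    known: dict mapping each drug to the set of its known proteins.
    predicted_pairs: iterable of (protein_a, protein_b, score), where score
        is a hypothetical classifier confidence that both proteins are
        associated with the same drugs.
    threshold: assumed cutoff below which a predicted pair is ignored.
    """
    # Invert the known associations: protein -> set of drugs.
    drugs_of = defaultdict(set)
    for drug, proteins in known.items():
        for protein in proteins:
            drugs_of[protein].add(drug)

    # Known associations keep full weight 1.0.
    augmented = {d: {p: 1.0 for p in ps} for d, ps in known.items()}

    for a, b, score in predicted_pairs:
        if score < threshold:
            continue
        # Propagate in both directions: drugs of a -> b, drugs of b -> a.
        for src, dst in ((a, b), (b, a)):
            for drug in drugs_of.get(src, ()):
                if dst not in known[drug]:
                    row = augmented[drug]
                    # Assumption: keep the strongest supporting score.
                    row[dst] = max(row.get(dst, 0.0), score)
    return augmented


# Toy data, entirely hypothetical.
known = {"drugA": {"P1", "P2"}, "drugB": {"P3"}}
pairs = [("P1", "P3", 0.9), ("P2", "P4", 0.4), ("P3", "P4", 0.7)]
result = augment_drug_protein(known, pairs)
```

With this toy input, `drugA` gains an inferred entry for `P3`, and `drugB` gains inferred entries for `P1` and `P4`, while the low-scoring pair (`P2`, `P4`) is ignored. The resulting weighted mapping could then be supplied to a downstream drug-disease method in place of the original binary associations.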