Drug Target Identification with Machine Learning: How to Choose Negative Examples
Identification of the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search, limiting the number of required experiments. However, Drug-Target Interactions databases used for training present hi...
Main authors: | Najm, Matthieu; Azencott, Chloé-Agathe; Playe, Benoit; Stoven, Véronique |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8151112/ https://www.ncbi.nlm.nih.gov/pubmed/34066072 http://dx.doi.org/10.3390/ijms22105118 |
_version_ | 1783698306876047360 |
---|---|
author | Najm, Matthieu; Azencott, Chloé-Agathe; Playe, Benoit; Stoven, Véronique |
author_facet | Najm, Matthieu; Azencott, Chloé-Agathe; Playe, Benoit; Stoven, Véronique |
author_sort | Najm, Matthieu |
collection | PubMed |
description | Identification of the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search, limiting the number of required experiments. However, Drug-Target Interactions databases used for training present high statistical bias, leading to a high number of false positives, thus increasing time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, so that each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs, and more globally for 200 approved drugs. For the detailed three drug examples, and for the larger set of 200 drugs, training with the proposed scheme for the choice of negative examples improved target prediction results: the average number of false positives among the top ranked predicted targets decreased, and overall, the rank of the true targets was improved. Our method corrects databases’ statistical bias and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken. |
format | Online Article Text |
id | pubmed-8151112 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8151112 2021-05-27 Drug Target Identification with Machine Learning: How to Choose Negative Examples Najm, Matthieu; Azencott, Chloé-Agathe; Playe, Benoit; Stoven, Véronique Int J Mol Sci Article Identification of the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search, limiting the number of required experiments. However, Drug-Target Interactions databases used for training present high statistical bias, leading to a high number of false positives, thus increasing time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, so that each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs, and more globally for 200 approved drugs. For the detailed three drug examples, and for the larger set of 200 drugs, training with the proposed scheme for the choice of negative examples improved target prediction results: the average number of false positives among the top ranked predicted targets decreased, and overall, the rank of the true targets was improved. Our method corrects databases’ statistical bias and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken. MDPI 2021-05-12 /pmc/articles/PMC8151112/ /pubmed/34066072 http://dx.doi.org/10.3390/ijms22105118 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Najm, Matthieu Azencott, Chloé-Agathe Playe, Benoit Stoven, Véronique Drug Target Identification with Machine Learning: How to Choose Negative Examples |
title | Drug Target Identification with Machine Learning: How to Choose Negative Examples |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8151112/ https://www.ncbi.nlm.nih.gov/pubmed/34066072 http://dx.doi.org/10.3390/ijms22105118 |
work_keys_str_mv | AT najmmatthieu drugtargetidentificationwithmachinelearninghowtochoosenegativeexamples AT azencottchloeagathe drugtargetidentificationwithmachinelearninghowtochoosenegativeexamples AT playebenoit drugtargetidentificationwithmachinelearninghowtochoosenegativeexamples AT stovenveronique drugtargetidentificationwithmachinelearninghowtochoosenegativeexamples |
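
The description above says negative examples are chosen so that each protein and each drug appears an equal number of times in positive and negative examples. The record does not give the authors' algorithm, so the following is only a minimal sketch of that balancing idea using a simple greedy heuristic. The function name `balanced_negatives` and the toy drug and protein labels are hypothetical, and the heuristic may leave some quotas unmet when no valid pair remains.

```python
import random
from collections import Counter


def balanced_negatives(positives, seed=0):
    """Greedy sketch (an assumption, not the published method).

    Picks (drug, protein) pairs that are not known interactions so that each
    drug and each protein appears at most as often among the negatives as it
    does among the positives, and exactly as often whenever the greedy search
    finds enough valid pairs.
    """
    rng = random.Random(seed)
    known = set(positives)
    # Remaining number of negatives still owed to each drug and protein.
    drug_quota = Counter(d for d, _ in known)
    prot_quota = Counter(p for _, p in known)
    negatives = set()

    # Frequent proteins are the hardest to balance, so handle them first.
    for prot in sorted(prot_quota, key=lambda p: (-prot_quota[p], p)):
        while prot_quota[prot] > 0:
            candidates = [
                d for d in drug_quota
                if drug_quota[d] > 0
                and (d, prot) not in known
                and (d, prot) not in negatives
            ]
            if not candidates:
                break  # quota cannot be met; protein stays under-represented
            # Prefer drugs with the largest remaining quota; break ties randomly.
            best = max(drug_quota[d] for d in candidates)
            drug = rng.choice(sorted(d for d in candidates if drug_quota[d] == best))
            negatives.add((drug, prot))
            drug_quota[drug] -= 1
            prot_quota[prot] -= 1

    return sorted(negatives)


# Toy usage: two known interactions, each drug and protein seen once.
print(balanced_negatives([("a", "x"), ("b", "y")]))
# → [('a', 'y'), ('b', 'x')]
```

In this toy case every drug and protein occurs once among the positives and once among the negatives. On real interaction data an exact balance is not always reachable with a greedy pass, so it is worth checking the per-drug and per-protein counts of the output before training.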