Addressing Noise and Estimating Uncertainty in Biomedical Data through the Exploration of Chemical Space

Noise is a basic ingredient in data, since observed data are always contaminated by unwanted deviations, i.e., noise, which, in the case of overdetermined systems (with more data than model parameters), cause the corresponding linear system of equations to have an imperfect solution. In addition, in...

Bibliographic Details
Main Authors: deAndrés-Galiana, Enrique J., Fernández-Martínez, Juan Luis, Fernández-Brillet, Lucas, Cernea, Ana, Kloczkowski, Andrzej
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9656407/
https://www.ncbi.nlm.nih.gov/pubmed/36361765
http://dx.doi.org/10.3390/ijms232112975
author deAndrés-Galiana, Enrique J.
Fernández-Martínez, Juan Luis
Fernández-Brillet, Lucas
Cernea, Ana
Kloczkowski, Andrzej
collection PubMed
description Noise is a basic ingredient in data, since observed data are always contaminated by unwanted deviations, i.e., noise, which, in the case of overdetermined systems (with more data than model parameters), cause the corresponding linear system of equations to have an imperfect solution. In addition, in the case of highly underdetermined parameterization, noise can be absorbed by the model, generating spurious solutions. This is a very undesirable situation that might lead to incorrect conclusions. We presented a mathematical formalism based on inverse problem theory combined with artificial intelligence methodologies to perform an enhanced sampling of noisy biomedical data to improve the finding of meaningful solutions. Random sampling methods fail for high-dimensional biomedical problems. Sampling methods such as smart model parameterizations, forward surrogates, and parallel computing are better suited for such problems. We applied these methods to several important biomedical problems, such as phenotype prediction and a problem related to predicting the effects of protein mutations, i.e., whether a given single-residue mutation is neutral or deleterious, causing a disease. We also applied these methods to de novo drug discovery and drug repositioning (repurposing) through the enhanced exploration of the huge chemical space. The purpose of these novel methods that address the problem of noise and uncertainty in biomedical data is to find new therapeutic solutions, perform drug repurposing, and accelerate and optimize drug discovery, thus reestablishing homeostasis. Finding the right target, the right compound, and the right patient are the three bottlenecks to running successful clinical trials from the correct analysis of preclinical models. Artificial intelligence can provide a solution to these problems, considering that the character of the data restricts the quality of the prediction, as in any modeling procedure in data analysis. The use of simple and plain methodologies is crucial to tackling these important and challenging problems, particularly drug repositioning/repurposing in rare diseases.
format Online
Article
Text
id pubmed-9656407
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9656407 2022-11-15 Addressing Noise and Estimating Uncertainty in Biomedical Data through the Exploration of Chemical Space deAndrés-Galiana, Enrique J. Fernández-Martínez, Juan Luis Fernández-Brillet, Lucas Cernea, Ana Kloczkowski, Andrzej Int J Mol Sci Review MDPI 2022-10-26 /pmc/articles/PMC9656407/ /pubmed/36361765 http://dx.doi.org/10.3390/ijms232112975 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Addressing Noise and Estimating Uncertainty in Biomedical Data through the Exploration of Chemical Space
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9656407/
https://www.ncbi.nlm.nih.gov/pubmed/36361765
http://dx.doi.org/10.3390/ijms232112975