Machine learning meets pKa
We present a small molecule pKa prediction tool entirely written in Python. It predicts the macroscopic pKa value and is trained on a literature compilation of monoprotic compounds. Different machine learning models were tested and random forest performed best given a five-fold cross-validatio...
Main Authors: | Baltruschat, Marcel; Czodrowski, Paul |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7096188/ https://www.ncbi.nlm.nih.gov/pubmed/32226607 http://dx.doi.org/10.12688/f1000research.22090.2 |
_version_ | 1783510766035402752 |
---|---|
author | Baltruschat, Marcel Czodrowski, Paul |
author_facet | Baltruschat, Marcel Czodrowski, Paul |
author_sort | Baltruschat, Marcel |
collection | PubMed |
description | We present a small molecule pKa prediction tool entirely written in Python. It predicts the macroscopic pKa value and is trained on a literature compilation of monoprotic compounds. Different machine learning models were tested and random forest performed best given a five-fold cross-validation (mean absolute error = 0.682, root mean squared error = 1.032, correlation coefficient r² = 0.82). We test our model on two external validation sets, where our model performs comparably to Marvin and is better than a recently published open source model. Our Python tool and all data are freely available at https://github.com/czodrowskilab/Machine-learning-meets-pKa. |
format | Online Article Text |
id | pubmed-7096188 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-7096188 2020-03-27 Machine learning meets pKa Baltruschat, Marcel Czodrowski, Paul F1000Res Research Article We present a small molecule pKa prediction tool entirely written in Python. It predicts the macroscopic pKa value and is trained on a literature compilation of monoprotic compounds. Different machine learning models were tested and random forest performed best given a five-fold cross-validation (mean absolute error = 0.682, root mean squared error = 1.032, correlation coefficient r² = 0.82). We test our model on two external validation sets, where our model performs comparably to Marvin and is better than a recently published open source model. Our Python tool and all data are freely available at https://github.com/czodrowskilab/Machine-learning-meets-pKa. F1000 Research Limited 2020-04-27 /pmc/articles/PMC7096188/ /pubmed/32226607 http://dx.doi.org/10.12688/f1000research.22090.2 Text en Copyright: © 2020 Baltruschat M and Czodrowski P http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Baltruschat, Marcel Czodrowski, Paul Machine learning meets pKa |
title | Machine learning meets pKa |
title_full | Machine learning meets pKa |
title_fullStr | Machine learning meets pKa |
title_full_unstemmed | Machine learning meets pKa |
title_short | Machine learning meets pKa |
title_sort | machine learning meets pka |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7096188/ https://www.ncbi.nlm.nih.gov/pubmed/32226607 http://dx.doi.org/10.12688/f1000research.22090.2 |
work_keys_str_mv | AT baltruschatmarcel machinelearningmeetspka AT czodrowskipaul machinelearningmeetspka |
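
The description field in this record reports a five-fold cross-validation summarized by mean absolute error, root mean squared error, and r². The following is a minimal, standard-library-only sketch of how such a split and such metrics can be computed. It is not the authors' code: the function names are hypothetical, the toy pKa values are invented for illustration, and r² is assumed here to mean the squared Pearson correlation (the coefficient of determination is another common reading).

```python
import math
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared deviation."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def squared_pearson_r(y_true, y_pred):
    """Squared Pearson correlation (assumed meaning of r^2 in this sketch)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


if __name__ == "__main__":
    # Invented toy values, not data from the paper.
    observed = [4.0, 7.0, 9.5, 10.0]
    predicted = [4.5, 6.0, 9.5, 11.0]
    print("MAE :", mean_absolute_error(observed, predicted))
    print("RMSE:", root_mean_squared_error(observed, predicted))
    print("r^2 :", squared_pearson_r(observed, predicted))
    for train_idx, test_idx in k_fold_indices(10, k=5):
        print(len(train_idx), "train /", len(test_idx), "test")
```

For the invented toy values above, the MAE is 0.625 and the RMSE is 0.75. In a real evaluation each fold's model would be fitted on `train_idx` and scored on `test_idx`, with the metrics then averaged or pooled across the five folds.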