SYBA: Bayesian estimation of synthetic accessibility of organic compounds
SYBA (SYnthetic Bayesian Accessibility) is a fragment-based method for the rapid classification of organic compounds as easy- (ES) or hard-to-synthesize (HS). It is based on a Bernoulli naïve Bayes classifier that is used to assign SYBA score contributions to individual fragments based on their freq...
Main authors: | Voršilák, Milan; Kolář, Michal; Čmelo, Ivan; Svozil, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7238540/ https://www.ncbi.nlm.nih.gov/pubmed/33431015 http://dx.doi.org/10.1186/s13321-020-00439-2 |
_version_ | 1783536551536361472 |
---|---|
author | Voršilák, Milan; Kolář, Michal; Čmelo, Ivan; Svozil, Daniel |
author_facet | Voršilák, Milan; Kolář, Michal; Čmelo, Ivan; Svozil, Daniel |
author_sort | Voršilák, Milan |
collection | PubMed |
description | SYBA (SYnthetic Bayesian Accessibility) is a fragment-based method for the rapid classification of organic compounds as easy-to-synthesize (ES) or hard-to-synthesize (HS). It is based on a Bernoulli naïve Bayes classifier that assigns SYBA score contributions to individual fragments according to their frequencies in databases of ES and HS molecules. SYBA was trained on ES molecules available in the ZINC15 database and on HS molecules generated by the Nonpher methodology. SYBA was compared with a random forest, which served as a baseline method, and with two other methods for synthetic accessibility assessment: SAScore and SCScore. When used with their suggested thresholds, SYBA improves over random forest classification, albeit marginally, and outperforms SAScore and SCScore. However, upon optimization of the SAScore threshold (which changes from 6.0 to approximately 4.5), SAScore yields results similar to those of SYBA. Because SYBA is based solely on fragment contributions, it can be used to analyze how individual molecular parts contribute to a compound's synthetic accessibility. SYBA is publicly available at https://github.com/lich-uct/syba under the GNU General Public License. |
format | Online Article Text |
id | pubmed-7238540 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-7238540 2020-05-27 SYBA: Bayesian estimation of synthetic accessibility of organic compounds Voršilák, Milan Kolář, Michal Čmelo, Ivan Svozil, Daniel J Cheminform Research Article SYBA (SYnthetic Bayesian Accessibility) is a fragment-based method for the rapid classification of organic compounds as easy-to-synthesize (ES) or hard-to-synthesize (HS). It is based on a Bernoulli naïve Bayes classifier that assigns SYBA score contributions to individual fragments according to their frequencies in databases of ES and HS molecules. SYBA was trained on ES molecules available in the ZINC15 database and on HS molecules generated by the Nonpher methodology. SYBA was compared with a random forest, which served as a baseline method, and with two other methods for synthetic accessibility assessment: SAScore and SCScore. When used with their suggested thresholds, SYBA improves over random forest classification, albeit marginally, and outperforms SAScore and SCScore. However, upon optimization of the SAScore threshold (which changes from 6.0 to approximately 4.5), SAScore yields results similar to those of SYBA. Because SYBA is based solely on fragment contributions, it can be used to analyze how individual molecular parts contribute to a compound's synthetic accessibility. SYBA is publicly available at https://github.com/lich-uct/syba under the GNU General Public License. Springer International Publishing 2020-05-20 /pmc/articles/PMC7238540/ /pubmed/33431015 http://dx.doi.org/10.1186/s13321-020-00439-2 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Voršilák, Milan Kolář, Michal Čmelo, Ivan Svozil, Daniel SYBA: Bayesian estimation of synthetic accessibility of organic compounds |
title | SYBA: Bayesian estimation of synthetic accessibility of organic compounds |
title_full | SYBA: Bayesian estimation of synthetic accessibility of organic compounds |
title_fullStr | SYBA: Bayesian estimation of synthetic accessibility of organic compounds |
title_full_unstemmed | SYBA: Bayesian estimation of synthetic accessibility of organic compounds |
title_short | SYBA: Bayesian estimation of synthetic accessibility of organic compounds |
title_sort | syba: bayesian estimation of synthetic accessibility of organic compounds |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7238540/ https://www.ncbi.nlm.nih.gov/pubmed/33431015 http://dx.doi.org/10.1186/s13321-020-00439-2 |
work_keys_str_mv | AT vorsilakmilan sybabayesianestimationofsyntheticaccessibilityoforganiccompounds AT kolarmichal sybabayesianestimationofsyntheticaccessibilityoforganiccompounds AT cmeloivan sybabayesianestimationofsyntheticaccessibilityoforganiccompounds AT svozildaniel sybabayesianestimationofsyntheticaccessibilityoforganiccompounds |
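
The description above says that a Bernoulli naïve Bayes classifier assigns score contributions to fragments from their frequencies in ES and HS molecules. The following is a minimal sketch of that general idea, not the authors' implementation: the fragment labels, the toy training sets, the Laplace smoothing constant `alpha`, and the function names `fit_fragment_scores` and `score_molecule` are all hypothetical, and the sign convention (positive leans ES, negative leans HS) is an assumption of this sketch.

```python
import math
from collections import Counter


def fit_fragment_scores(es_mols, hs_mols, alpha=1.0):
    """Return {fragment: (score_if_present, score_if_absent)}.

    Each molecule is a set of fragment labels. Scores are Bernoulli
    log-odds of ES versus HS, with Laplace smoothing (assumed here).
    """
    n_es, n_hs = len(es_mols), len(hs_mols)
    # Count in how many molecules of each class a fragment occurs.
    es_counts = Counter(f for mol in es_mols for f in set(mol))
    hs_counts = Counter(f for mol in hs_mols for f in set(mol))
    scores = {}
    for frag in set(es_counts) | set(hs_counts):
        p_es = (es_counts[frag] + alpha) / (n_es + 2 * alpha)
        p_hs = (hs_counts[frag] + alpha) / (n_hs + 2 * alpha)
        scores[frag] = (
            math.log(p_es / p_hs),              # fragment present
            math.log((1 - p_es) / (1 - p_hs)),  # fragment absent
        )
    return scores


def score_molecule(fragments, scores):
    """Sum the contributions of all known fragments.

    Fragments never seen in training are ignored. Equal class priors
    are assumed, so no prior term is added.
    """
    present = set(fragments)
    return sum(
        s_present if frag in present else s_absent
        for frag, (s_present, s_absent) in scores.items()
    )


# Toy data with placeholder fragment labels (not real SYBA fragments).
es_train = [{"benzene", "amide"}, {"benzene", "amine"}, {"amide", "amine"}]
hs_train = [{"spiro", "amine"}, {"spiro", "polyyne"}, {"polyyne", "benzene"}]

scores = fit_fragment_scores(es_train, hs_train)
easy_score = score_molecule({"benzene", "amide"}, scores)
hard_score = score_molecule({"spiro", "polyyne"}, scores)
print(f"ES-like molecule: {easy_score:.3f}")
print(f"HS-like molecule: {hard_score:.3f}")
```

Because the total is a plain sum of per-fragment terms, the entry for any single fragment in `scores` can be read off directly, which mirrors the description's point that fragment contributions can be inspected individually.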