
RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells

An important aspect of chemoinformatics and material-informatics is the usage of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for...


Bibliographic Details
Main authors: Kaspi, Omer, Yosipof, Abraham, Senderowitz, Hanoch
Format: Online Article Text
Language: English
Published: Springer International Publishing 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5461245/
https://www.ncbi.nlm.nih.gov/pubmed/29086047
http://dx.doi.org/10.1186/s13321-017-0224-0
author Kaspi, Omer
Yosipof, Abraham
Senderowitz, Hanoch
collection PubMed
description An important aspect of chemoinformatics and material-informatics is the use of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning noise from datasets. RANSAC could be used as a “one stop shop” algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development and predictions for test set samples using an applicability domain. For “future” predictions (i.e., for samples not included in the original test set), RANSAC provides a statistical estimate of the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that, for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic (PV) properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
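The outlier-removal and model-fitting idea summarized in the description can be illustrated with a minimal sketch of generic RANSAC. This is not the authors' implementation: it assumes a single descriptor, a straight-line model, and hypothetical values for the iteration count, sample size, and inlier threshold, and the toy data are synthetic. It repeatedly fits a model to a small random sample, counts the points that agree with it (the consensus set), and refits on the largest consensus set found.

```python
import random


def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("degenerate sample: all x values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, n_iter=200, sample_size=2, threshold=1.0, seed=0):
    """Generic RANSAC for a line; all parameter defaults are illustrative."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iter):
        # 1. Fit a candidate model to a small random sample.
        sample = rng.sample(points, sample_size)
        try:
            a, b = fit_line(sample)
        except ValueError:
            continue  # skip degenerate samples
        # 2. Collect the consensus set: points within the residual threshold.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= threshold]
        # 3. Keep the largest consensus set seen so far.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise RuntimeError("no usable consensus set found")
    # 4. Refit on all inliers; everything else is flagged as an outlier.
    model = fit_line(best_inliers)
    outliers = [p for p in points if p not in best_inliers]
    return model, best_inliers, outliers


# Synthetic toy data (assumption): y = 2x + 1 with small deterministic noise,
# plus three planted gross outliers.
clean = [(float(x), 2.0 * x + 1.0 + 0.1 * ((x % 3) - 1)) for x in range(20)]
planted = [(3.5, 40.0), (10.5, -30.0), (15.5, 80.0)]
points = clean + planted

(slope, intercept), inliers, outliers = ransac_line(points)
print(f"slope={slope:.3f} intercept={intercept:.3f} outliers={sorted(outliers)}")
```

In a QSAR setting the same loop would presumably run over multi-descriptor models, with the residual threshold playing the role of the pre-defined number of standard deviations mentioned in the description; that mapping is an interpretation, not something the record spells out.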
format Online
Article
Text
id pubmed-5461245
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-5461245 2017-06-22 RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells Kaspi, Omer Yosipof, Abraham Senderowitz, Hanoch J Cheminform Research Article An important aspect of chemoinformatics and material-informatics is the use of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning noise from datasets. RANSAC could be used as a “one stop shop” algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development and predictions for test set samples using an applicability domain. For “future” predictions (i.e., for samples not included in the original test set), RANSAC provides a statistical estimate of the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that, for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic (PV) properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
Springer International Publishing 2017-06-06 /pmc/articles/PMC5461245/ /pubmed/29086047 http://dx.doi.org/10.1186/s13321-017-0224-0 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells
topic Research Article