
R.ROSETTA: an interpretable machine learning framework

BACKGROUND: Machine learning involves strategies and algorithms that may assist bioinformatics analyses in terms of data mining and knowledge discovery. In several applications, viz. in Life Sciences, it is often more important to understand how a prediction was obtained rather than knowing what pre...


Bibliographic Details
Main Authors:	Garbulowski, Mateusz; Diamanti, Klev; Smolińska, Karolina; Baltzer, Nicholas; Stoll, Patricia; Bornelöv, Susanne; Øhrn, Aleksander; Feuk, Lars; Komorowski, Jan
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2021
Subjects:	Software
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7937228/ https://www.ncbi.nlm.nih.gov/pubmed/33676405 http://dx.doi.org/10.1186/s12859-021-04049-z

_version_	1783661346078851072
author	Garbulowski, Mateusz; Diamanti, Klev; Smolińska, Karolina; Baltzer, Nicholas; Stoll, Patricia; Bornelöv, Susanne; Øhrn, Aleksander; Feuk, Lars; Komorowski, Jan
author_sort	Garbulowski, Mateusz
collection	PubMed
description	BACKGROUND: Machine learning involves strategies and algorithms that may assist bioinformatics analyses in terms of data mining and knowledge discovery. In several applications, viz. in Life Sciences, it is often more important to understand how a prediction was obtained rather than knowing what prediction was made. To this end so-called interpretable machine learning has been recently advocated. In this study, we implemented an interpretable machine learning package based on the rough set theory. An important aim of our work was provision of statistical properties of the models and their components. RESULTS: We present the R.ROSETTA package, which is an R wrapper of ROSETTA framework. The original ROSETTA functions have been improved and adapted to the R programming environment. The package allows for building and analyzing non-linear interpretable machine learning models. R.ROSETTA gathers combinatorial statistics via rule-based modelling for accessible and transparent results, well-suited for adoption within the greater scientific community. The package also provides statistics and visualization tools that facilitate minimization of analysis bias and noise. The R.ROSETTA package is freely available at https://github.com/komorowskilab/R.ROSETTA. To illustrate the usage of the package, we applied it to a transcriptome dataset from an autism case–control study. Our tool provided hypotheses for potential co-predictive mechanisms among features that discerned phenotype classes. These co-predictors represented neurodevelopmental and autism-related genes. CONCLUSIONS: R.ROSETTA provides new insights for interpretable machine learning analyses and knowledge-based systems. We demonstrated that our package facilitated detection of dependencies for autism-related genes. Although the sample application of R.ROSETTA illustrates transcriptome data analysis, the package can be used to analyze any data organized in decision tables. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04049-z.
format	Online Article Text
id	pubmed-7937228
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7937228 2021-03-09 R.ROSETTA: an interpretable machine learning framework Garbulowski, Mateusz Diamanti, Klev Smolińska, Karolina Baltzer, Nicholas Stoll, Patricia Bornelöv, Susanne Øhrn, Aleksander Feuk, Lars Komorowski, Jan BMC Bioinformatics Software BACKGROUND: Machine learning involves strategies and algorithms that may assist bioinformatics analyses in terms of data mining and knowledge discovery. In several applications, viz. in Life Sciences, it is often more important to understand how a prediction was obtained rather than knowing what prediction was made. To this end so-called interpretable machine learning has been recently advocated. In this study, we implemented an interpretable machine learning package based on the rough set theory. An important aim of our work was provision of statistical properties of the models and their components. RESULTS: We present the R.ROSETTA package, which is an R wrapper of ROSETTA framework. The original ROSETTA functions have been improved and adapted to the R programming environment. The package allows for building and analyzing non-linear interpretable machine learning models. R.ROSETTA gathers combinatorial statistics via rule-based modelling for accessible and transparent results, well-suited for adoption within the greater scientific community. The package also provides statistics and visualization tools that facilitate minimization of analysis bias and noise. The R.ROSETTA package is freely available at https://github.com/komorowskilab/R.ROSETTA. To illustrate the usage of the package, we applied it to a transcriptome dataset from an autism case–control study. Our tool provided hypotheses for potential co-predictive mechanisms among features that discerned phenotype classes. These co-predictors represented neurodevelopmental and autism-related genes. CONCLUSIONS: R.ROSETTA provides new insights for interpretable machine learning analyses and knowledge-based systems. We demonstrated that our package facilitated detection of dependencies for autism-related genes. Although the sample application of R.ROSETTA illustrates transcriptome data analysis, the package can be used to analyze any data organized in decision tables. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04049-z. BioMed Central 2021-03-06 /pmc/articles/PMC7937228/ /pubmed/33676405 http://dx.doi.org/10.1186/s12859-021-04049-z Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	R.ROSETTA: an interpretable machine learning framework
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7937228/ https://www.ncbi.nlm.nih.gov/pubmed/33676405 http://dx.doi.org/10.1186/s12859-021-04049-z
