The Resolved Mutual Information Function as a Structural Fingerprint of Biomolecular Sequences for Interpretable Machine Learning Classifiers
In the present article we propose the application of variants of the mutual information function as characteristic fingerprints of biomolecular sequences for classification analysis. In particular, we consider the resolved mutual information functions based on Shannon-, Rényi-, and Tsallis-entropy....
Main authors: | Bohnsack, Katrin Sophie; Kaden, Marika; Abel, Julia; Saralajew, Sascha; Villmann, Thomas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534762/ https://www.ncbi.nlm.nih.gov/pubmed/34682081 http://dx.doi.org/10.3390/e23101357 |
_version_ | 1784587622769754112 |
---|---|
author | Bohnsack, Katrin Sophie Kaden, Marika Abel, Julia Saralajew, Sascha Villmann, Thomas |
author_sort | Bohnsack, Katrin Sophie |
collection | PubMed |
description | In the present article we propose the application of variants of the mutual information function as characteristic fingerprints of biomolecular sequences for classification analysis. In particular, we consider the resolved mutual information functions based on Shannon-, Rényi-, and Tsallis-entropy. In combination with interpretable machine learning classifier models based on generalized learning vector quantization, a powerful methodology for sequence classification is achieved which allows substantial knowledge extraction in addition to the high classification ability due to the model-inherent robustness. Any potential (slightly) inferior performance of the used classifier is compensated by the additional knowledge provided by interpretable models. This knowledge may assist the user in the analysis and understanding of the used data and considered task. After theoretical justification of the concepts, we demonstrate the approach for various example data sets covering different areas in biomolecular sequence analysis. |
format | Online Article Text |
id | pubmed-8534762 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8534762 2021-10-23 The Resolved Mutual Information Function as a Structural Fingerprint of Biomolecular Sequences for Interpretable Machine Learning Classifiers Bohnsack, Katrin Sophie Kaden, Marika Abel, Julia Saralajew, Sascha Villmann, Thomas Entropy (Basel) Article In the present article we propose the application of variants of the mutual information function as characteristic fingerprints of biomolecular sequences for classification analysis. In particular, we consider the resolved mutual information functions based on Shannon-, Rényi-, and Tsallis-entropy. In combination with interpretable machine learning classifier models based on generalized learning vector quantization, a powerful methodology for sequence classification is achieved which allows substantial knowledge extraction in addition to the high classification ability due to the model-inherent robustness. Any potential (slightly) inferior performance of the used classifier is compensated by the additional knowledge provided by interpretable models. This knowledge may assist the user in the analysis and understanding of the used data and considered task. After theoretical justification of the concepts, we demonstrate the approach for various example data sets covering different areas in biomolecular sequence analysis. MDPI 2021-10-17 /pmc/articles/PMC8534762/ /pubmed/34682081 http://dx.doi.org/10.3390/e23101357 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | The Resolved Mutual Information Function as a Structural Fingerprint of Biomolecular Sequences for Interpretable Machine Learning Classifiers |
topic | Article |