The Resolved Mutual Information Function as a Structural Fingerprint of Biomolecular Sequences for Interpretable Machine Learning Classifiers

In the present article we propose the application of variants of the mutual information function as characteristic fingerprints of biomolecular sequences for classification analysis. In particular, we consider the resolved mutual information functions based on Shannon-, Rényi-, and Tsallis-entropy....

Bibliographic Details
Main authors: Bohnsack, Katrin Sophie; Kaden, Marika; Abel, Julia; Saralajew, Sascha; Villmann, Thomas
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534762/
https://www.ncbi.nlm.nih.gov/pubmed/34682081
http://dx.doi.org/10.3390/e23101357
_version_ 1784587622769754112
author Bohnsack, Katrin Sophie
Kaden, Marika
Abel, Julia
Saralajew, Sascha
Villmann, Thomas
collection PubMed
description In the present article we propose the application of variants of the mutual information function as characteristic fingerprints of biomolecular sequences for classification analysis. In particular, we consider the resolved mutual information functions based on Shannon-, Rényi-, and Tsallis-entropy. In combination with interpretable machine learning classifier models based on generalized learning vector quantization, a powerful methodology for sequence classification is achieved which allows substantial knowledge extraction in addition to the high classification ability due to the model-inherent robustness. Any potential (slightly) inferior performance of the used classifier is compensated by the additional knowledge provided by interpretable models. This knowledge may assist the user in the analysis and understanding of the used data and considered task. After theoretical justification of the concepts, we demonstrate the approach for various example data sets covering different areas in biomolecular sequence analysis.
format Online
Article
Text
id pubmed-8534762
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8534762 2021-10-23 The Resolved Mutual Information Function as a Structural Fingerprint of Biomolecular Sequences for Interpretable Machine Learning Classifiers Bohnsack, Katrin Sophie Kaden, Marika Abel, Julia Saralajew, Sascha Villmann, Thomas Entropy (Basel) Article In the present article we propose the application of variants of the mutual information function as characteristic fingerprints of biomolecular sequences for classification analysis. In particular, we consider the resolved mutual information functions based on Shannon-, Rényi-, and Tsallis-entropy. In combination with interpretable machine learning classifier models based on generalized learning vector quantization, a powerful methodology for sequence classification is achieved which allows substantial knowledge extraction in addition to the high classification ability due to the model-inherent robustness. Any potential (slightly) inferior performance of the used classifier is compensated by the additional knowledge provided by interpretable models. This knowledge may assist the user in the analysis and understanding of the used data and considered task. After theoretical justification of the concepts, we demonstrate the approach for various example data sets covering different areas in biomolecular sequence analysis. MDPI 2021-10-17 /pmc/articles/PMC8534762/ /pubmed/34682081 http://dx.doi.org/10.3390/e23101357 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title The Resolved Mutual Information Function as a Structural Fingerprint of Biomolecular Sequences for Interpretable Machine Learning Classifiers
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534762/
https://www.ncbi.nlm.nih.gov/pubmed/34682081
http://dx.doi.org/10.3390/e23101357