Patch seriation to visualize data and model parameters
We developed a new seriation merit function for enhancing the visual information of data matrices. A local similarity matrix is calculated, where the average similarity of neighbouring objects is calculated in a limited variable space and a global function is constructed to maximize the local simila...
Main authors: | Lasfar, Rita; Tóth, Gergely |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10492365/ https://www.ncbi.nlm.nih.gov/pubmed/37689697 http://dx.doi.org/10.1186/s13321-023-00757-1 |
_version_ | 1785104240142712832 |
---|---|
author | Lasfar, Rita Tóth, Gergely |
author_facet | Lasfar, Rita Tóth, Gergely |
author_sort | Lasfar, Rita |
collection | PubMed |
description | We developed a new seriation merit function for enhancing the visual information of data matrices. A local similarity matrix is calculated, where the average similarity of neighbouring objects is calculated in a limited variable space and a global function is constructed to maximize the local similarities and cluster them into patches by simple row and column ordering. The method identifies data clusters in a powerful way, if the similarity of objects is caused by some variables and these variables differ for the distinct clusters. The method can be used in the presence of missing data and also on more than two-dimensional data arrays. We show the feasibility of the method on different data sets: on QSAR, chemical, material science, food science, cheminformatics and environmental data in two- and three-dimensional cases. The method can be used during the development and the interpretation of artificial neural network models by seriating different features of the models. It helps to identify interpretable models by elucidating clusters of objects, variables and hidden layer neurons. GRAPHICAL ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-023-00757-1. |
format | Online Article Text |
id | pubmed-10492365 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-10492365 2023-09-10 Patch seriation to visualize data and model parameters Lasfar, Rita Tóth, Gergely J Cheminform Research We developed a new seriation merit function for enhancing the visual information of data matrices. A local similarity matrix is calculated, where the average similarity of neighbouring objects is calculated in a limited variable space and a global function is constructed to maximize the local similarities and cluster them into patches by simple row and column ordering. The method identifies data clusters in a powerful way, if the similarity of objects is caused by some variables and these variables differ for the distinct clusters. The method can be used in the presence of missing data and also on more than two-dimensional data arrays. We show the feasibility of the method on different data sets: on QSAR, chemical, material science, food science, cheminformatics and environmental data in two- and three-dimensional cases. The method can be used during the development and the interpretation of artificial neural network models by seriating different features of the models. It helps to identify interpretable models by elucidating clusters of objects, variables and hidden layer neurons. GRAPHICAL ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-023-00757-1. Springer International Publishing 2023-09-09 /pmc/articles/PMC10492365/ /pubmed/37689697 http://dx.doi.org/10.1186/s13321-023-00757-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Lasfar, Rita Tóth, Gergely Patch seriation to visualize data and model parameters |
title | Patch seriation to visualize data and model parameters |
title_full | Patch seriation to visualize data and model parameters |
title_fullStr | Patch seriation to visualize data and model parameters |
title_full_unstemmed | Patch seriation to visualize data and model parameters |
title_short | Patch seriation to visualize data and model parameters |
title_sort | patch seriation to visualize data and model parameters |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10492365/ https://www.ncbi.nlm.nih.gov/pubmed/37689697 http://dx.doi.org/10.1186/s13321-023-00757-1 |
work_keys_str_mv | AT lasfarrita patchseriationtovisualizedataandmodelparameters AT tothgergely patchseriationtovisualizedataandmodelparameters |
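
The description field above outlines the general idea: score how similar neighbouring cells of a data matrix are, then reorder rows and columns so that the total local similarity grows and similar values gather into patches. The following is only a minimal sketch of that idea and not the merit function published by the authors. It assumes values scaled to [0, 1], takes `1 - |a - b|` as the similarity of two adjacent cells, looks only at immediate right and lower neighbours, and uses a plain random-swap hill climb. The names `local_merit` and `seriate` are hypothetical. Missing values are represented as NaN and are skipped.

```python
import numpy as np


def local_merit(X):
    """Sum of similarities between each cell and its right and lower neighbour.

    Assumption: similarity of two cells is 1 - |a - b| for values in [0, 1].
    Pairs that involve a NaN (missing value) contribute nothing.
    """
    horiz = 1.0 - np.abs(X[:, 1:] - X[:, :-1])
    vert = 1.0 - np.abs(X[1:, :] - X[:-1, :])
    return float(np.nansum(horiz) + np.nansum(vert))


def seriate(X, n_iter=2000, seed=0):
    """Random-swap hill climbing over row and column orders (illustrative only).

    Returns the row order, the column order and the merit that was reached.
    """
    rng = np.random.default_rng(seed)
    rows = np.arange(X.shape[0])
    cols = np.arange(X.shape[1])
    best = local_merit(X)
    for _ in range(n_iter):
        # Pick rows or columns at random, then swap two of their positions.
        perm = rows if rng.integers(2) == 0 else cols
        i, j = rng.choice(len(perm), size=2, replace=False)
        perm[i], perm[j] = perm[j], perm[i]
        score = local_merit(X[np.ix_(rows, cols)])
        if score >= best:
            best = score  # keep the swap
        else:
            perm[i], perm[j] = perm[j], perm[i]  # undo the swap
    return rows, cols, best


# Toy data: two blocks of objects that are high on different variables,
# with one missing value, shuffled before seriation.
demo_rng = np.random.default_rng(1)
blocks = np.zeros((12, 8))
blocks[:6, :4] = 1.0
blocks[6:, 4:] = 1.0
blocks[2, 1] = np.nan
shuffled = blocks[np.ix_(demo_rng.permutation(12), demo_rng.permutation(8))]

merit_before = local_merit(shuffled)
row_order, col_order, merit_after = seriate(shuffled)
ordered = shuffled[np.ix_(row_order, col_order)]
print(merit_before, merit_after)
```

Plotting `shuffled` and `ordered` side by side as heat maps is one way to inspect whether patches have formed. A hill climb of this kind can stop in a local optimum, so the result depends on `seed` and `n_iter`.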