
Data integration by fuzzy similarity-based hierarchical clustering

BACKGROUND: High throughput methods, in biological and biomedical fields, acquire a large number of molecular parameters or omics data by a single experiment. Combining these omics data can significantly increase the capability for recovering fine-tuned structures or reducing the effects of experime...


Bibliographic Details
Main authors:	Ciaramella, Angelo; Nardone, Davide; Staiano, Antonino
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7446192/ https://www.ncbi.nlm.nih.gov/pubmed/32838739 http://dx.doi.org/10.1186/s12859-020-03567-6

author	Ciaramella, Angelo; Nardone, Davide; Staiano, Antonino
author_sort	Ciaramella, Angelo
collection	PubMed
description	BACKGROUND: High-throughput methods in the biological and biomedical fields acquire a large number of molecular parameters, or omics data, in a single experiment. Combining these omics data can significantly improve the ability to recover fine-tuned structures and to reduce the effects of experimental and biological noise in the data. RESULTS: In this work we propose a multi-view integration methodology (named FH-Clust) for identifying patient subgroups from different omics information (e.g., gene expression, miRNA expression, methylation). In particular, a hierarchical structure of the patient data is obtained in each omic (or view), and the resulting topologies are then merged by a consensus matrix. One of the main aspects of this methodology is the use of a measure of dissimilarity between sets of observations, based on an appropriate metric. For each view, a dendrogram is obtained by hierarchical clustering based on a fuzzy equivalence relation with Łukasiewicz-valued fuzzy similarity. Finally, a consensus matrix, which summarizes the information in all dendrograms, is formed by combining the multiple hierarchical agglomerations through an approach based on transitive consensus matrix construction. Several experiments and comparisons are made on real data (e.g., glioblastoma, prostate cancer) to assess the proposed approach. CONCLUSIONS: Fuzzy logic allows us to introduce more flexible data agglomeration techniques. From an analysis of the scientific literature, this appears to be the first time that a model based on fuzzy logic has been used for the agglomeration of multi-omic data. The results suggest that FH-Clust provides better prognostic value and clinical significance than the analysis of single-omic data alone, and that it is very competitive with other techniques from the literature.
format	Online Article Text
id	pubmed-7446192
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7446192 2020-08-26 Data integration by fuzzy similarity-based hierarchical clustering. Ciaramella, Angelo; Nardone, Davide; Staiano, Antonino. BMC Bioinformatics. Research.
BioMed Central 2020-08-21 /pmc/articles/PMC7446192/ /pubmed/32838739 http://dx.doi.org/10.1186/s12859-020-03567-6 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Data integration by fuzzy similarity-based hierarchical clustering
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7446192/ https://www.ncbi.nlm.nih.gov/pubmed/32838739 http://dx.doi.org/10.1186/s12859-020-03567-6
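
The description in this record outlines three steps: a Łukasiewicz-valued fuzzy similarity per view, a hierarchy derived from a fuzzy equivalence relation, and a transitive consensus matrix across views. The Python sketch below illustrates that general pipeline on toy data; it is not the authors' FH-Clust implementation. Assumptions not stated in the record: features are min-max scaled to [0, 1], similarity is the mean Łukasiewicz bi-residuum 1 - |x - y|, the equivalence relation is obtained by max-min transitive closure, and per-view matrices are averaged before being closed again. All function names and data values are hypothetical.

```python
import numpy as np


def lukasiewicz_similarity(X):
    """Pairwise similarity: mean of 1 - |x - y| over features scaled to [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on constant features
    U = (X - lo) / span
    diff = np.abs(U[:, None, :] - U[None, :, :])  # shape (n, n, features)
    return 1.0 - diff.mean(axis=2)


def maxmin_closure(R):
    """Max-min transitive closure (assumed here so that alpha-cuts are partitions)."""
    R = np.asarray(R, dtype=float).copy()
    while True:
        # comp[i, j] = max over k of min(R[i, k], R[k, j])
        comp = np.minimum(R[:, :, None], R[None, :, :]).max(axis=1)
        new = np.maximum(R, comp)
        if np.array_equal(new, R):
            return R
        R = new


def consensus(views):
    """Assumed consensus rule: average the per-view closures, then close again."""
    closures = [maxmin_closure(lukasiewicz_similarity(X)) for X in views]
    return maxmin_closure(np.mean(closures, axis=0))


def alpha_cut_labels(R, alpha):
    """Cut a transitive similarity matrix at level alpha into crisp clusters."""
    representatives = np.argmax(R >= alpha, axis=1)  # first member of each class
    return np.unique(representatives, return_inverse=True)[1]


# Toy data: 4 patients, two hypothetical views (rows are patients).
views = [
    np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]),
    np.array([[5.0], [5.5], [9.0], [10.0]]),
]
C = consensus(views)
labels = alpha_cut_labels(C, alpha=0.5)
print(labels.tolist())
```

Varying `alpha` from 1 down to 0 traces out the levels of the hierarchy: higher thresholds give finer partitions, lower thresholds merge clusters. The max-min closure is chosen in this sketch because it guarantees that every alpha-cut is an equivalence relation; the published method may use a different closure or consensus rule.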


Similar Items