
An improved algorithm for the maximal information coefficient and its application

The maximal information coefficient (MIC) captures both linear and nonlinear correlations between variable pairs. In this paper, we propose the BackMIC algorithm for MIC estimation. The BackMIC algorithm adds a search-back step on the equipartitioned axis to obtain a better grid partition tha...


Bibliographic Details
Main authors: Cao, Dan; Chen, Yuan; Chen, Jin; Zhang, Hongyan; Yuan, Zheming
Format: Online Article Text
Language: English
Published: The Royal Society 2021
Subjects: Mathematics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8074658/
https://www.ncbi.nlm.nih.gov/pubmed/33972855
http://dx.doi.org/10.1098/rsos.201424
author Cao, Dan
Chen, Yuan
Chen, Jin
Zhang, Hongyan
Yuan, Zheming
collection PubMed
description The maximal information coefficient (MIC) captures both linear and nonlinear correlations between variable pairs. In this paper, we propose the BackMIC algorithm for MIC estimation. BackMIC adds a search-back step on the equipartitioned axis to obtain a better grid partition than ApproxMaxMI, the original implementation algorithm. Like the ChiMIC algorithm, it terminates the grid search with a χ²-test rather than at the maximum number of bins B(n, α). Results on simulated data show that BackMIC maintains the generality of MIC and gives more reasonable grid partitions and MIC values for both independent and dependent variable pairs, with comparable running times. It is also robust to the choice of α in B(n, α), and the MIC values it produces show improved statistical power and equitability. We applied (1-MIC) as the distance measure in the K-means algorithm to cluster cancer and normal samples. On four cancer datasets, the MIC values calculated by BackMIC gave better clustering results, indicating that the correlations between samples measured by BackMIC were more credible than those measured by other algorithms.
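To make the quantities named in the description concrete, here is a minimal Python sketch of a grid-based MIC estimate, the bin cap B(n, α), and the (1-MIC) distance. It is not BackMIC, ChiMIC, or ApproxMaxMI: it only tries equal-frequency partitions on both axes, has no search-back step and no χ²-test, and ignores ties. All function names are hypothetical, B(n, α) = n^α is assumed as the usual definition, and the default α = 0.6 is an assumption based on common practice.

```python
import math
from collections import Counter


def max_bins(n, alpha=0.6):
    """Assumed cap B(n, alpha) = n ** alpha on the number of grid cells."""
    return n ** alpha


def equal_frequency_bins(values, k):
    """Label each value with one of k bins holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(values)
    return labels


def grid_score(x, y, kx, ky):
    """Mutual information on a kx-by-ky grid, divided by log2(min(kx, ky))."""
    n = len(x)
    bx = equal_frequency_bins(x, kx)
    by = equal_frequency_bins(y, ky)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = sum(
        c / n * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )
    return mi / math.log2(min(kx, ky))


def naive_mic(x, y, alpha=0.6):
    """Best normalised score over grids with kx * ky <= B(n, alpha).

    Simplification: only equal-frequency grids are tried, so this is a
    lower bound on what an optimised partition search could find.
    """
    limit = max_bins(len(x), alpha)
    best = 0.0
    for kx in range(2, int(limit // 2) + 1):
        for ky in range(2, int(limit // kx) + 1):
            best = max(best, grid_score(x, y, kx, ky))
    return best


def mic_distance(x, y, alpha=0.6):
    """Distance (1 - MIC), the form the description uses for clustering."""
    return 1.0 - naive_mic(x, y, alpha)
```

For a perfectly linear pair such as `x = list(range(32))` and `y = x`, the 2-by-2 grid already carries one full bit of information, so this sketch returns a score of 1 and a distance of 0. How the distance is plugged into K-means is not specified in this record and is left out here.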
format Online
Article
Text
id pubmed-8074658
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher The Royal Society
record_format MEDLINE/PubMed
spelling pubmed-8074658 2021-05-09 An improved algorithm for the maximal information coefficient and its application. Cao, Dan; Chen, Yuan; Chen, Jin; Zhang, Hongyan; Yuan, Zheming. R Soc Open Sci. Mathematics. The Royal Society, 2021-02-10. /pmc/articles/PMC8074658/ /pubmed/33972855 http://dx.doi.org/10.1098/rsos.201424 Text en © 2021 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, provided the original author and source are credited.
title An improved algorithm for the maximal information coefficient and its application
topic Mathematics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8074658/
https://www.ncbi.nlm.nih.gov/pubmed/33972855
http://dx.doi.org/10.1098/rsos.201424