An improved algorithm for the maximal information coefficient and its application
The maximal information coefficient (MIC) captures both linear and nonlinear correlations between variable pairs. In this paper, we proposed the BackMIC algorithm for MIC estimation. The BackMIC algorithm adds a searching back process on the equipartitioned axis to obtain a better grid partition tha...
Main authors: | Cao, Dan; Chen, Yuan; Chen, Jin; Zhang, Hongyan; Yuan, Zheming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society, 2021 |
Subjects: | Mathematics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8074658/ https://www.ncbi.nlm.nih.gov/pubmed/33972855 http://dx.doi.org/10.1098/rsos.201424 |
_version_ | 1783684392676229120 |
---|---|
author | Cao, Dan Chen, Yuan Chen, Jin Zhang, Hongyan Yuan, Zheming |
author_sort | Cao, Dan |
collection | PubMed |
description | The maximal information coefficient (MIC) captures both linear and nonlinear correlations between variable pairs. In this paper, we propose the BackMIC algorithm for MIC estimation. The BackMIC algorithm adds a search-back process on the equipartitioned axis to obtain a better grid partition than the original implementation algorithm, ApproxMaxMI. Like the ChiMIC algorithm, it terminates the grid search with a χ²-test instead of the maximum number of bins B(n, α). Results on simulated data show that the BackMIC algorithm maintains the generality of MIC and gives more reasonable grid partitions and MIC values for independent and dependent variable pairs under comparable running times. Moreover, it is robust under different α in B(n, α). MIC calculated by the BackMIC algorithm shows an improvement in statistical power and equitability. We applied (1 − MIC) as the distance measure in the K-means algorithm to cluster cancer/normal samples. The results on four cancer datasets demonstrated that the MIC values calculated by the BackMIC algorithm yield better clustering results, indicating that the correlations between samples measured by the BackMIC algorithm were more credible than those measured by other algorithms. |
format | Online Article Text |
id | pubmed-8074658 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | The Royal Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-8074658 2021-05-09 An improved algorithm for the maximal information coefficient and its application Cao, Dan Chen, Yuan Chen, Jin Zhang, Hongyan Yuan, Zheming R Soc Open Sci Mathematics The Royal Society 2021-02-10 /pmc/articles/PMC8074658/ /pubmed/33972855 http://dx.doi.org/10.1098/rsos.201424 Text en © 2021 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, provided the original author and source are credited. |
spellingShingle | Mathematics Cao, Dan Chen, Yuan Chen, Jin Zhang, Hongyan Yuan, Zheming An improved algorithm for the maximal information coefficient and its application |
title | An improved algorithm for the maximal information coefficient and its application |
title_sort | improved algorithm for the maximal information coefficient and its application |
topic | Mathematics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8074658/ https://www.ncbi.nlm.nih.gov/pubmed/33972855 http://dx.doi.org/10.1098/rsos.201424 |
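
The description above mentions two ideas that a small example may make more concrete: MIC as a grid-based, normalized mutual information maximized over grids bounded by B(n, α), and (1 − MIC) as a distance. The following is only a rough sketch under stated assumptions, not the BackMIC, ChiMIC, or ApproxMaxMI algorithm: it equipartitions both axes and tries every grid size with nx · ny ≤ n^α, with no search-back step and no χ²-test. The function names (`equal_frequency_bins`, `mutual_information`, `naive_mic`, `mic_distance`) and the default α = 0.6 are illustrative choices, not taken from the record.

```python
import math
from collections import Counter


def equal_frequency_bins(values, k):
    """Label each value with one of k bins holding roughly equal counts.

    Ties are split by sort order, which is a simplification.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * k // n
    return labels


def mutual_information(bx, by):
    """Mutual information (in nats) between two label sequences."""
    n = len(bx)
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def naive_mic(x, y, alpha=0.6):
    """Crude MIC estimate using equipartitioned grids only (assumption).

    Tries every grid with nx * ny <= n**alpha and keeps the largest
    mutual information normalized by log(min(nx, ny)).
    """
    n = len(x)
    limit = n ** alpha
    best = 0.0
    for nx in range(2, int(limit // 2) + 1):
        bx = equal_frequency_bins(x, nx)
        for ny in range(2, int(limit // nx) + 1):
            by = equal_frequency_bins(y, ny)
            score = mutual_information(bx, by) / math.log(min(nx, ny))
            best = max(best, score)
    return min(best, 1.0)  # guard against floating-point overshoot


def mic_distance(x, y, alpha=0.6):
    """Distance in the (1 - MIC) form mentioned in the description."""
    return 1.0 - naive_mic(x, y, alpha)


if __name__ == "__main__":
    xs = list(range(100))
    line = [2 * v for v in xs]
    parabola = [(v - 50) ** 2 for v in xs]
    print("linear   :", naive_mic(xs, line))
    print("parabola :", naive_mic(xs, parabola))
    print("distance :", mic_distance(xs, line))
```

Because both axes are equipartitioned here, this estimate will generally fall below what an optimized partition of one axis would reach on non-monotonic relationships; choosing better partitions is the kind of gap the grid-search algorithms named in the description are meant to address.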