
Bagging Nearest-Neighbor Prediction independence Test: an efficient method for nonlinear dependence of two continuous variables

Testing dependence/correlation of two variables is one of the fundamental tasks in statistics. In this work, we proposed an efficient method for nonlinear dependence of two continuous variables (X and Y). We addressed this research question by using BNNPT (Bagging Nearest-Neighbor Prediction indepen...


Bibliographic Details
Main Authors: Wang, Yi; Li, Yi; Liu, Xiaoyu; Pu, Weilin; Wang, Xiaofeng; Wang, Jiucun; Xiong, Momiao; Yao Shugart, Yin; Jin, Li
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630623/
https://www.ncbi.nlm.nih.gov/pubmed/28986523
http://dx.doi.org/10.1038/s41598-017-12783-9
author Wang, Yi
Li, Yi
Liu, Xiaoyu
Pu, Weilin
Wang, Xiaofeng
Wang, Jiucun
Xiong, Momiao
Yao Shugart, Yin
Jin, Li
author_facet Wang, Yi
Li, Yi
Liu, Xiaoyu
Pu, Weilin
Wang, Xiaofeng
Wang, Jiucun
Xiong, Momiao
Yao Shugart, Yin
Jin, Li
author_sort Wang, Yi
collection PubMed
description Testing dependence/correlation of two variables is one of the fundamental tasks in statistics. In this work, we proposed an efficient method for nonlinear dependence of two continuous variables (X and Y). We addressed this research question by using BNNPT (Bagging Nearest-Neighbor Prediction independence Test, software available at https://sourceforge.net/projects/bnnpt/). In the BNNPT framework, we first used the value of X to construct a bagging neighborhood structure. We then obtained the out of bag estimator of Y based on the bagging neighborhood structure. The square error was calculated to measure how well Y is predicted by X. Finally, a permutation test was applied to determine the significance of the observed square error. To evaluate the strength of BNNPT compared to seven other methods, we performed extensive simulations to explore the relationship between various methods and compared the false positive rates and statistical power using both simulated and real datasets (Rugao longevity cohort mitochondrial DNA haplogroups and kidney cancer RNA-seq datasets). We concluded that BNNPT is an efficient computational approach to test nonlinear correlation in real world applications.
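The four steps named in the description (a bagging neighborhood structure built from X, an out-of-bag estimate of Y, a squared-error statistic, and a permutation test) can be illustrated with a minimal sketch. This is not the authors' BNNPT software. The function names, the use of a single nearest in-bag neighbor per bag, and the default values of `n_bags` and `n_perm` are all assumptions made for illustration only.

```python
import random


def build_neighborhood(x, n_bags, rng):
    """For each point, collect its nearest in-bag neighbor (in X) from every
    bootstrap bag in which the point itself is out of bag.
    Assumption: one nearest neighbor per bag; the published method may differ."""
    n = len(x)
    neighbors = [[] for _ in range(n)]
    for _ in range(n_bags):
        # Bootstrap sample of indices, drawn with replacement.
        in_bag = sorted(set(rng.randrange(n) for _ in range(n)))
        for i in range(n):
            if i not in in_bag:
                nearest = min(in_bag, key=lambda j: abs(x[j] - x[i]))
                neighbors[i].append(nearest)
    return neighbors


def oob_squared_error(y, neighbors):
    """Mean squared error between each y[i] and the average y of its
    out-of-bag neighbors. Points that were never out of bag are skipped."""
    total, count = 0.0, 0
    for i, nbrs in enumerate(neighbors):
        if nbrs:
            prediction = sum(y[j] for j in nbrs) / len(nbrs)
            total += (y[i] - prediction) ** 2
            count += 1
    if count == 0:
        raise ValueError("no out-of-bag predictions; increase n_bags")
    return total / count


def bnn_permutation_test(x, y, n_bags=50, n_perm=200, seed=0):
    """Return (observed squared error, permutation p-value).
    The neighborhood depends only on X, so it is built once and reused
    for every permutation of Y."""
    rng = random.Random(seed)
    neighbors = build_neighborhood(x, n_bags, rng)
    observed = oob_squared_error(y, neighbors)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        # A small error means X predicts Y well, so count permutations
        # that predict at least as well as the observed data.
        if oob_squared_error(y_perm, neighbors) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical demo data: Y depends on X nonlinearly (a parabola plus noise).
    gen = random.Random(42)
    x = [gen.uniform(-1.0, 1.0) for _ in range(80)]
    y = [v * v + gen.gauss(0.0, 0.05) for v in x]
    error, p_value = bnn_permutation_test(x, y)
    print("out-of-bag squared error:", error)
    print("permutation p-value:", p_value)
```

Because permuting Y leaves X unchanged, the sketch reuses one neighborhood structure across all permutations, which keeps the permutation loop cheap. Whether the published implementation makes the same choice is not stated in this record.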
format Online
Article
Text
id pubmed-5630623
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-56306232017-10-17 Bagging Nearest-Neighbor Prediction independence Test: an efficient method for nonlinear dependence of two continuous variables Wang, Yi Li, Yi Liu, Xiaoyu Pu, Weilin Wang, Xiaofeng Wang, Jiucun Xiong, Momiao Yao Shugart, Yin Jin, Li Sci Rep Article Testing dependence/correlation of two variables is one of the fundamental tasks in statistics. In this work, we proposed an efficient method for nonlinear dependence of two continuous variables (X and Y). We addressed this research question by using BNNPT (Bagging Nearest-Neighbor Prediction independence Test, software available at https://sourceforge.net/projects/bnnpt/). In the BNNPT framework, we first used the value of X to construct a bagging neighborhood structure. We then obtained the out of bag estimator of Y based on the bagging neighborhood structure. The square error was calculated to measure how well Y is predicted by X. Finally, a permutation test was applied to determine the significance of the observed square error. To evaluate the strength of BNNPT compared to seven other methods, we performed extensive simulations to explore the relationship between various methods and compared the false positive rates and statistical power using both simulated and real datasets (Rugao longevity cohort mitochondrial DNA haplogroups and kidney cancer RNA-seq datasets). We concluded that BNNPT is an efficient computational approach to test nonlinear correlation in real world applications. 
Nature Publishing Group UK 2017-10-06 /pmc/articles/PMC5630623/ /pubmed/28986523 http://dx.doi.org/10.1038/s41598-017-12783-9 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Wang, Yi
Li, Yi
Liu, Xiaoyu
Pu, Weilin
Wang, Xiaofeng
Wang, Jiucun
Xiong, Momiao
Yao Shugart, Yin
Jin, Li
Bagging Nearest-Neighbor Prediction independence Test: an efficient method for nonlinear dependence of two continuous variables
title Bagging Nearest-Neighbor Prediction independence Test: an efficient method for nonlinear dependence of two continuous variables
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630623/
https://www.ncbi.nlm.nih.gov/pubmed/28986523
http://dx.doi.org/10.1038/s41598-017-12783-9