Bagging Nearest-Neighbor Prediction independence Test: an efficient method for nonlinear dependence of two continuous variables
Testing dependence/correlation of two variables is one of the fundamental tasks in statistics. In this work, we proposed an efficient method for nonlinear dependence of two continuous variables (X and Y). We addressed this research question by using BNNPT (Bagging Nearest-Neighbor Prediction indepen...
Main authors: | Wang, Yi; Li, Yi; Liu, Xiaoyu; Pu, Weilin; Wang, Xiaofeng; Wang, Jiucun; Xiong, Momiao; Yao Shugart, Yin; Jin, Li |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630623/ https://www.ncbi.nlm.nih.gov/pubmed/28986523 http://dx.doi.org/10.1038/s41598-017-12783-9 |
_version_ | 1783269256709799936 |
---|---|
author | Wang, Yi; Li, Yi; Liu, Xiaoyu; Pu, Weilin; Wang, Xiaofeng; Wang, Jiucun; Xiong, Momiao; Yao Shugart, Yin; Jin, Li |
author_sort | Wang, Yi |
collection | PubMed |
description | Testing dependence/correlation of two variables is one of the fundamental tasks in statistics. In this work, we proposed an efficient method for nonlinear dependence of two continuous variables (X and Y). We addressed this research question by using BNNPT (Bagging Nearest-Neighbor Prediction independence Test, software available at https://sourceforge.net/projects/bnnpt/). In the BNNPT framework, we first used the value of X to construct a bagging neighborhood structure. We then obtained the out-of-bag estimator of Y based on the bagging neighborhood structure. The squared error was calculated to measure how well Y is predicted by X. Finally, a permutation test was applied to determine the significance of the observed squared error. To evaluate the strength of BNNPT compared to seven other methods, we performed extensive simulations to explore the relationship between various methods and compared the false positive rates and statistical power using both simulated and real datasets (Rugao longevity cohort mitochondrial DNA haplogroups and kidney cancer RNA-seq datasets). We concluded that BNNPT is an efficient computational approach to test nonlinear correlation in real-world applications. |
format | Online Article Text |
id | pubmed-5630623 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-5630623 2017-10-17 Sci Rep Article Nature Publishing Group UK 2017-10-06 /pmc/articles/PMC5630623/ /pubmed/28986523 http://dx.doi.org/10.1038/s41598-017-12783-9 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Bagging Nearest-Neighbor Prediction independence Test: an efficient method for nonlinear dependence of two continuous variables |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630623/ https://www.ncbi.nlm.nih.gov/pubmed/28986523 http://dx.doi.org/10.1038/s41598-017-12783-9 |
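
The procedure outlined in the description field (a bagging neighborhood built from X, an out-of-bag estimate of Y, a squared error, and a permutation test) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' BNNPT software: the use of a single nearest neighbor per bag, the averaging of predictions across bags, the default numbers of bags and permutations, the add-one p-value formula, and all function names are hypothetical choices made for the example.

```python
import bisect
import random


def build_neighbors(x, n_bags, rng):
    """For every point, list its nearest in-bag neighbor (by X) in each
    bootstrap bag where that point is out of bag. Assumption: 1-NN per bag."""
    n = len(x)
    neighbors = [[] for _ in range(n)]
    for _ in range(n_bags):
        # Bootstrap sample of indices; duplicates collapse, which is harmless for 1-NN.
        bag = {rng.randrange(n) for _ in range(n)}
        in_bag = sorted(bag, key=lambda i: x[i])
        xs = [x[i] for i in in_bag]
        for i in range(n):
            if i in bag:
                continue  # only out-of-bag points are predicted
            pos = bisect.bisect_left(xs, x[i])
            # The nearest neighbor is one of the (at most two) adjacent in-bag points.
            candidates = in_bag[max(pos - 1, 0):pos + 1]
            neighbors[i].append(min(candidates, key=lambda j: abs(x[j] - x[i])))
    return neighbors


def oob_squared_error(y, neighbors):
    """Mean squared error between Y and its out-of-bag neighbor-based estimate."""
    total, count = 0.0, 0
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            continue  # point was in every bag, so it has no out-of-bag estimate
        pred = sum(y[j] for j in nbrs) / len(nbrs)
        total += (y[i] - pred) ** 2
        count += 1
    if count == 0:
        raise ValueError("no out-of-bag predictions; increase n_bags")
    return total / count


def permutation_test(x, y, n_bags=20, n_perm=200, seed=0):
    """Return (observed squared error, permutation p-value).

    The neighborhood depends only on X, so it is built once and reused
    for every permutation of Y.
    """
    rng = random.Random(seed)
    neighbors = build_neighbors(x, n_bags, rng)
    observed = oob_squared_error(y, neighbors)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        # A permuted error as small as the observed one counts against dependence.
        if oob_squared_error(y_perm, neighbors) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_test(x, y)` on two equal-length lists of floats returns the observed out-of-bag squared error and a p-value; a small p-value suggests that X predicts Y better than chance. Reusing the X-based neighborhood across permutations is one plausible reason such a test can be computationally cheap, though the record itself does not spell out the implementation.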