Random Bits Forest: a Strong Classifier/Regressor for Big Data
Efficiency, memory consumption, and robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth), boosting (for width), and random forests (for pre...
Main Authors: | Wang, Yi; Li, Yi; Pu, Weilin; Wen, Kathryn; Shugart, Yin Yao; Xiong, Momiao; Jin, Li |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Nature Publishing Group
2016
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4957112/ https://www.ncbi.nlm.nih.gov/pubmed/27444562 http://dx.doi.org/10.1038/srep30086 |
_version_ | 1782444126585749504 |
---|---|
author | Wang, Yi Li, Yi Pu, Weilin Wen, Kathryn Shugart, Yin Yao Xiong, Momiao Jin, Li |
author_facet | Wang, Yi Li, Yi Pu, Weilin Wen, Kathryn Shugart, Yin Yao Xiong, Momiao Jin, Li |
author_sort | Wang, Yi |
collection | PubMed |
description | Efficiency, memory consumption, and robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth), boosting (for width), and random forests (for prediction accuracy). Through a gradient boosting scheme, it first generates and selects ~10,000 small, 3-layer random neural networks. These networks are then fed into a modified random forest algorithm to obtain predictions. Testing with datasets from the UCI (University of California, Irvine) Machine Learning Repository shows that RBF outperforms other popular methods in both accuracy and robustness, especially with large datasets (N > 1000). The algorithm also performed highly in testing with an independent data set, a real psoriasis genome-wide association study (GWAS). |
format | Online Article Text |
id | pubmed-4957112 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-4957112 2016-07-26 Random Bits Forest: a Strong Classifier/Regressor for Big Data Wang, Yi Li, Yi Pu, Weilin Wen, Kathryn Shugart, Yin Yao Xiong, Momiao Jin, Li Sci Rep Article Efficiency, memory consumption, and robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth), boosting (for width), and random forests (for prediction accuracy). Through a gradient boosting scheme, it first generates and selects ~10,000 small, 3-layer random neural networks. These networks are then fed into a modified random forest algorithm to obtain predictions. Testing with datasets from the UCI (University of California, Irvine) Machine Learning Repository shows that RBF outperforms other popular methods in both accuracy and robustness, especially with large datasets (N > 1000). The algorithm also performed highly in testing with an independent data set, a real psoriasis genome-wide association study (GWAS). Nature Publishing Group 2016-07-22 /pmc/articles/PMC4957112/ /pubmed/27444562 http://dx.doi.org/10.1038/srep30086 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Wang, Yi Li, Yi Pu, Weilin Wen, Kathryn Shugart, Yin Yao Xiong, Momiao Jin, Li Random Bits Forest: a Strong Classifier/Regressor for Big Data |
title | Random Bits Forest: a Strong Classifier/Regressor for Big Data |
title_full | Random Bits Forest: a Strong Classifier/Regressor for Big Data |
title_fullStr | Random Bits Forest: a Strong Classifier/Regressor for Big Data |
title_full_unstemmed | Random Bits Forest: a Strong Classifier/Regressor for Big Data |
title_short | Random Bits Forest: a Strong Classifier/Regressor for Big Data |
title_sort | random bits forest: a strong classifier/regressor for big data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4957112/ https://www.ncbi.nlm.nih.gov/pubmed/27444562 http://dx.doi.org/10.1038/srep30086 |
work_keys_str_mv | AT wangyi randombitsforestastrongclassifierregressorforbigdata AT liyi randombitsforestastrongclassifierregressorforbigdata AT puweilin randombitsforestastrongclassifierregressorforbigdata AT wenkathryn randombitsforestastrongclassifierregressorforbigdata AT shugartyinyao randombitsforestastrongclassifierregressorforbigdata AT xiongmomiao randombitsforestastrongclassifierregressorforbigdata AT jinli randombitsforestastrongclassifierregressorforbigdata |
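
The description above says that RBF first generates and selects small, 3-layer random neural networks through a gradient boosting scheme, then feeds them into a modified random forest. The following is a minimal sketch of the first stage only, and it is not the authors' implementation. Every name in it (`make_random_bit`, `select_bits`, `to_bits`) is hypothetical, and the network shape (3 inputs, 2 hidden units, `tanh` activation, threshold at zero), the candidate counts, and the shrinkage value are assumptions chosen for illustration; the record does not specify them. Selection here uses a simple squared-error gain against the current residual, which is one common way to do boosting-style selection.

```python
import math
import random


def make_random_bit(n_features, rng, n_inputs=3, n_hidden=2):
    """Build one tiny 3-layer network with random weights (assumed shape).

    Layers: a random subset of inputs -> tanh hidden units -> one output
    thresholded at zero, so each network maps a sample to a single bit.
    """
    idx = [rng.randrange(n_features) for _ in range(n_inputs)]
    w_hidden = [[rng.gauss(0, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b_hidden = [rng.gauss(0, 1) for _ in range(n_hidden)]
    w_out = [rng.gauss(0, 1) for _ in range(n_hidden)]
    b_out = rng.gauss(0, 1)

    def bit(x):
        hidden = [
            math.tanh(sum(w * x[i] for w, i in zip(ws, idx)) + b)
            for ws, b in zip(w_hidden, b_hidden)
        ]
        return 1 if sum(w * h for w, h in zip(w_out, hidden)) + b_out > 0 else 0

    return bit


def select_bits(X, y, n_select=5, n_candidates=50, shrinkage=0.5, seed=0):
    """Boosting-style selection: each round keeps the candidate bit that
    best explains the current residual, then shrinks the residual."""
    rng = random.Random(seed)
    n = len(X)
    mean_y = sum(y) / n
    residual = [yi - mean_y for yi in y]
    selected = []
    for _ in range(n_select):
        best = None
        for _ in range(n_candidates):
            bit = make_random_bit(len(X[0]), rng)
            z = [bit(x) for x in X]
            ones = sum(z)
            if ones == 0 or ones == n:
                continue  # a constant bit carries no information
            mean1 = sum(r for r, zi in zip(residual, z) if zi) / ones
            mean0 = sum(r for r, zi in zip(residual, z) if not zi) / (n - ones)
            # Squared-error reduction from fitting a two-level fit on this bit.
            gain = ones * mean1 ** 2 + (n - ones) * mean0 ** 2
            if best is None or gain > best[0]:
                best = (gain, bit, z, mean0, mean1)
        if best is None:
            break
        _, bit, z, mean0, mean1 = best
        selected.append(bit)
        residual = [
            r - shrinkage * (mean1 if zi else mean0)
            for r, zi in zip(residual, z)
        ]
    return selected


def to_bits(X, bits):
    """Encode each sample as the vector of selected random bits."""
    return [[b(x) for b in bits] for x in X]


# Toy data (fabricated for illustration): label depends on an interaction.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(40)]
y = [1.0 if x[0] * x[1] > 0 else 0.0 for x in X]

bits = select_bits(X, y)
B = to_bits(X, bits)
print(len(bits), B[0])
```

In a full pipeline the rows of `B` would be passed to a forest learner in place of the raw features. The description mentions roughly 10,000 selected networks and a modified random forest; neither the scale nor the modification is reproduced in this sketch.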