
Random Bits Forest: a Strong Classifier/Regressor for Big Data

Efficiency, memory consumption, and robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth), boosting (for width), and random forests (for pre...

Bibliographic Details
Main Authors: Wang, Yi; Li, Yi; Pu, Weilin; Wen, Kathryn; Shugart, Yin Yao; Xiong, Momiao; Jin, Li
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4957112/
https://www.ncbi.nlm.nih.gov/pubmed/27444562
http://dx.doi.org/10.1038/srep30086
author Wang, Yi
Li, Yi
Pu, Weilin
Wen, Kathryn
Shugart, Yin Yao
Xiong, Momiao
Jin, Li
collection PubMed
description Efficiency, memory consumption, and robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth), boosting (for width), and random forests (for prediction accuracy). Through a gradient boosting scheme, it first generates and selects ~10,000 small, 3-layer random neural networks. These networks are then fed into a modified random forest algorithm to obtain predictions. Testing with datasets from the UCI (University of California, Irvine) Machine Learning Repository shows that RBF outperforms other popular methods in both accuracy and robustness, especially with large datasets (N > 1000). The algorithm also performed highly in testing with an independent data set, a real psoriasis genome-wide association study (GWAS).
format Online
Article
Text
id pubmed-4957112
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4957112 2016-07-26
Sci Rep
Article
Nature Publishing Group 2016-07-22
/pmc/articles/PMC4957112/
/pubmed/27444562
http://dx.doi.org/10.1038/srep30086
Text en
Copyright © 2016, Macmillan Publishers Limited
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Random Bits Forest: a Strong Classifier/Regressor for Big Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4957112/
https://www.ncbi.nlm.nih.gov/pubmed/27444562
http://dx.doi.org/10.1038/srep30086