Decision tree methods: applications for classification and prediction

Bibliographic Details
Main authors: SONG, Yan-yan; LU, Ying
Format: Online Article Text
Language: English
Published: Shanghai Municipal Bureau of Publishing 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4466856/
https://www.ncbi.nlm.nih.gov/pubmed/26120265
http://dx.doi.org/10.11919/j.issn.1002-0829.215044
Description
Summary: Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets: the training dataset is used to build a decision tree model, and the validation dataset is used to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces algorithms frequently used to develop decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure.
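
The training/validation procedure described in the summary can be illustrated with a minimal sketch. This is not the paper's code (the paper works with SPSS and SAS); it is a small pure-Python, CART-style tree that splits on Gini impurity, grown on a training set, with the tree size (here, the maximum depth) chosen by accuracy on a separate validation set. The synthetic data, the 10% label noise, and all function names (`gini`, `best_split`, `build_tree`, `predict`, `accuracy`, `make_data`, `choose_depth`) are assumptions made for illustration only.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) pair that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best

def build_tree(rows, labels, max_depth):
    """Grow a binary tree; internal nodes are tuples, leaf nodes are class labels."""
    majority = Counter(labels).most_common(1)[0][0]
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return majority  # leaf node
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1))

def predict(tree, row):
    """Walk from the root node down to a leaf node."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(labels)

def make_data(n, seed):
    """Synthetic data (assumption): class depends on feature 0, with 10% label noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        u, v = rng.random(), rng.random()
        y = 1 if u > 0.5 else 0
        if rng.random() < 0.1:
            y = 1 - y
        rows.append([u, v])
        labels.append(y)
    return rows, labels

def choose_depth(train, valid, depths):
    """Fit one tree per depth on train; keep the smallest depth with the best validation accuracy."""
    scores = {d: accuracy(build_tree(train[0], train[1], d), valid[0], valid[1])
              for d in depths}
    best = None
    for d in sorted(scores):
        if best is None or scores[d] > scores[best]:
            best = d
    return best, scores

train = make_data(200, seed=1)
valid = make_data(100, seed=2)
best_depth, scores = choose_depth(train, valid, range(0, 6))
print("validation accuracy by depth:", scores)
print("selected depth:", best_depth)
```

Deeper trees tend to fit the noise in the training data, so validation accuracy typically stops improving beyond some depth; choosing the smallest depth that reaches the best validation accuracy is one simple way to pick the tree size, under the assumptions stated above.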