Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case
Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassificat...
Main Authors: | Bolin, Jocelyn Holden; Finch, W. Holmes |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2014 |
Subjects: | Psychology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3937587/ https://www.ncbi.nlm.nih.gov/pubmed/24616711 http://dx.doi.org/10.3389/fpsyg.2014.00118 |
_version_ | 1782305522739838976 |
---|---|
author | Bolin, Jocelyn Holden; Finch, W. Holmes |
author_facet | Bolin, Jocelyn Holden; Finch, W. Holmes |
author_sort | Bolin, Jocelyn Holden |
collection | PubMed |
description | Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as misclassification percentage increases with random forests demonstrating the highest accuracy across conditions. |
format | Online Article Text |
id | pubmed-3937587 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-39375872014-03-10 Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case Bolin, Jocelyn Holden Finch, W. Holmes Front Psychol Psychology Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as misclassification percentage increases with random forests demonstrating the highest accuracy across conditions. Frontiers Media S.A. 2014-02-28 /pmc/articles/PMC3937587/ /pubmed/24616711 http://dx.doi.org/10.3389/fpsyg.2014.00118 Text en Copyright © 2014 Bolin and Finch. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Psychology Bolin, Jocelyn Holden Finch, W. Holmes Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case |
title | Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case |
title_full | Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case |
title_fullStr | Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case |
title_full_unstemmed | Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case |
title_short | Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case |
title_sort | supervised classification in the presence of misclassified training data: a monte carlo simulation study in the three group case |
topic | Psychology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3937587/ https://www.ncbi.nlm.nih.gov/pubmed/24616711 http://dx.doi.org/10.3389/fpsyg.2014.00118 |
work_keys_str_mv | AT bolinjocelynholden supervisedclassificationinthepresenceofmisclassifiedtrainingdataamontecarlosimulationstudyinthethreegroupcase AT finchwholmes supervisedclassificationinthepresenceofmisclassifiedtrainingdataamontecarlosimulationstudyinthethreegroupcase |
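
The description in this record outlines a simulation design: three groups, a share of training cases assigned to the wrong group, and classification accuracy as the outcome. A minimal sketch of that kind of experiment is given below. It is an illustration only, not the authors' code. The nearest-centroid classifier is a simple stand-in for the methods the study compared, and every name and parameter value here (`make_groups`, `mislabel`, `run`, 50 cases per group, a separation of 2.0, 20 replications) is a hypothetical choice not taken from the article.

```python
import random

def make_groups(n_per_group, separation, rng):
    """Draw 2-D points for three groups whose centers sit `separation` apart on x."""
    data = []
    for g in range(3):
        for _ in range(n_per_group):
            point = (rng.gauss(g * separation, 1.0), rng.gauss(0.0, 1.0))
            data.append((point, g))
    return data

def mislabel(data, fraction, rng):
    """Return a copy in which `fraction` of the cases are moved to a different group."""
    noisy = list(data)
    k = round(fraction * len(noisy))
    for i in rng.sample(range(len(noisy)), k):
        point, g = noisy[i]
        noisy[i] = (point, rng.choice([h for h in range(3) if h != g]))
    return noisy

def fit_centroids(train):
    """Compute the mean point of each labeled group (stand-in classifier, an assumption)."""
    sums = {g: [0.0, 0.0, 0] for g in range(3)}
    for (x, y), g in train:
        sums[g][0] += x
        sums[g][1] += y
        sums[g][2] += 1
    return {g: (s[0] / s[2], s[1] / s[2]) for g, s in sums.items() if s[2] > 0}

def predict(centroids, point):
    """Assign a point to the group with the closest centroid."""
    return min(
        centroids,
        key=lambda g: (point[0] - centroids[g][0]) ** 2 + (point[1] - centroids[g][1]) ** 2,
    )

def accuracy(centroids, test):
    """Share of correctly labeled test cases that the classifier recovers."""
    return sum(predict(centroids, p) == g for p, g in test) / len(test)

def run(fraction, replications=20, seed=0):
    """Average test accuracy over replications, training on partly mislabeled data."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        train = mislabel(make_groups(50, 2.0, rng), fraction, rng)
        test = make_groups(50, 2.0, rng)  # test labels are left uncorrupted
        total += accuracy(fit_centroids(train), test)
    return total / replications

if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.3):
        print(fraction, round(run(fraction), 3))
```

Calling `run` with different `fraction` values gives a rough view of how training-label error affects accuracy under these assumed settings. Sample size, group separation, and group size ratio could be varied in the same way by changing the arguments to `make_groups`.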