A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies


Bibliographic Details
Main Authors: Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel
Format: Online Article Text
Language: English
Published: SAGE Publications, 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5081132/
https://www.ncbi.nlm.nih.gov/pubmed/24047600
http://dx.doi.org/10.1177/0962280213502437
_version_ 1782462838814539776
author Khondoker, Mizanur
Dobson, Richard
Skirrow, Caroline
Simmons, Andrew
Stahl, Daniel
author_facet Khondoker, Mizanur
Dobson, Richard
Skirrow, Caroline
Simmons, Andrew
Stahl, Daniel
author_sort Khondoker, Mizanur
collection PubMed
description BACKGROUND: Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets, and sampling error in performance measures estimated from single samples, are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply better performance on average or at the population level, and simulation studies may be a better alternative for objectively comparing the performance of machine learning algorithms. METHODS: We compare the classification performance of a number of important and widely used machine learning algorithms, namely Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. RESULTS: For smaller numbers of correlated features, with the number of features not exceeding approximately half the sample size, LDA was the method of choice in terms of both average generalisation error and stability (precision) of the error estimates. SVM (with RBF kernel) outperforms LDA, RF and kNN by a clear margin as the feature set grows larger, provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows, overtaking LDA and RF unless the data variability is too high and/or the effect sizes are too small. RF outperformed only kNN, in some instances where the data are more variable and effect sizes are smaller; in those cases it also provided more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study.
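The simulation design described in the abstract, generating two-class data at chosen levels of effect size and sample size, then estimating each classifier's generalisation error on a large independent test set, can be sketched in miniature. This is an illustrative sketch only, not the paper's actual simulation code: the parameter values are arbitrary, and a diagonal-covariance linear discriminant is used as a simplified stand-in for full LDA.

```python
import random

random.seed(2013)

def simulate(n_per_class, p, effect):
    """Two Gaussian classes; class 1 is shifted by `effect` on feature 0 only."""
    X, y = [], []
    for cls in (0, 1):
        for _ in range(n_per_class):
            row = [random.gauss(effect if (cls == 1 and j == 0) else 0.0, 1.0)
                   for j in range(p)]
            X.append(row)
            y.append(cls)
    return X, y

def knn_error(Xtr, ytr, Xte, yte, k=5):
    """Misclassification rate of a plain k-nearest-neighbour majority vote."""
    wrong = 0
    for x, true in zip(Xte, yte):
        idx = sorted(range(len(Xtr)),
                     key=lambda i: sum((a - b) ** 2 for a, b in zip(Xtr[i], x)))
        vote = sum(ytr[i] for i in idx[:k])
        pred = 1 if 2 * vote > k else 0
        wrong += pred != true
    return wrong / len(Xte)

def dlda_error(Xtr, ytr, Xte, yte):
    """Diagonal-covariance linear discriminant (a simplification of LDA)."""
    p = len(Xtr[0])
    means, var = {}, [0.0] * p
    for cls in (0, 1):
        rows = [x for x, t in zip(Xtr, ytr) if t == cls]
        means[cls] = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
        for r in rows:
            for j in range(p):
                var[j] += (r[j] - means[cls][j]) ** 2
    var = [v / (len(Xtr) - 2) for v in var]  # pooled per-feature variance
    wrong = 0
    for x, true in zip(Xte, yte):
        score = {cls: sum((x[j] - means[cls][j]) ** 2 / var[j] for j in range(p))
                 for cls in (0, 1)}
        pred = 0 if score[0] <= score[1] else 1
        wrong += pred != true
    return wrong / len(Xte)

Xtr, ytr = simulate(25, 4, effect=2.0)   # small training sample
Xte, yte = simulate(200, 4, effect=2.0)  # large test set for the error estimate
err_knn = knn_error(Xtr, ytr, Xte, yte)
err_lda = dlda_error(Xtr, ytr, Xte, yte)
print(f"kNN generalisation error:  {err_knn:.3f}")
print(f"DLDA generalisation error: {err_lda:.3f}")
```

In the paper's full design this inner comparison is repeated over a grid of factor levels (number of features, training sample size, variability, effect size, correlation) with many replications per cell, which is what made supercomputer-scale parallelism necessary.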
format Online
Article
Text
id pubmed-5081132
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-5081132 2017-03-10 A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel Stat Methods Med Res Articles (abstract as given in the description field above) SAGE Publications 2013-09-18 2016-10 /pmc/articles/PMC5081132/ /pubmed/24047600 http://dx.doi.org/10.1177/0962280213502437 Text en © The Author(s) 2013 http://creativecommons.org/licenses/by/3.0/ This article is distributed under the terms of the Creative Commons Attribution 3.0 License (http://www.creativecommons.org/licenses/by/3.0/), which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Articles
Khondoker, Mizanur
Dobson, Richard
Skirrow, Caroline
Simmons, Andrew
Stahl, Daniel
A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies
title A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies
title_full A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies
title_fullStr A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies
title_full_unstemmed A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies
title_short A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies
title_sort comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5081132/
https://www.ncbi.nlm.nih.gov/pubmed/24047600
http://dx.doi.org/10.1177/0962280213502437
work_keys_str_mv AT khondokermizanur acomparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT dobsonrichard acomparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT skirrowcaroline acomparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT simmonsandrew acomparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT stahldaniel acomparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT khondokermizanur comparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT dobsonrichard comparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT skirrowcaroline comparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT simmonsandrew comparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies
AT stahldaniel comparisonofmachinelearningmethodsforclassificationusingsimulationwithmultiplerealdataexamplesfrommentalhealthstudies