
Comparing different supervised machine learning algorithms for disease prediction

BACKGROUND: Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently shown a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine lea...


Bibliographic Details
Main authors: Uddin, Shahadat, Khan, Arif, Hossain, Md Ekramul, Moni, Mohammad Ali
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6925840/
https://www.ncbi.nlm.nih.gov/pubmed/31864346
http://dx.doi.org/10.1186/s12911-019-1004-8
_version_ 1783481987620667392
author Uddin, Shahadat
Khan, Arif
Hossain, Md Ekramul
Moni, Mohammad Ali
author_facet Uddin, Shahadat
Khan, Arif
Hossain, Md Ekramul
Moni, Mohammad Ali
author_sort Uddin, Shahadat
collection PubMed
description BACKGROUND: Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently shown a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine learning algorithms, and their performance and usage for disease risk prediction. METHODS: In this study, extensive research efforts were made to identify those studies that applied more than one supervised machine learning algorithm on single disease prediction. Two databases (i.e., Scopus and PubMed) were searched for different types of search items. Thus, we selected 48 articles in total for the comparison among variants supervised machine learning algorithms for disease prediction. RESULTS: We found that the Support Vector Machine (SVM) algorithm is applied most frequently (in 29 studies) followed by the Naïve Bayes algorithm (in 23 studies). However, the Random Forest (RF) algorithm showed superior accuracy comparatively. Of the 17 studies where it was applied, RF showed the highest accuracy in 9 of them, i.e., 53%. This was followed by SVM which topped in 41% of the studies it was considered. CONCLUSION: This study provides a wide overview of the relative performance of different variants of supervised machine learning algorithms for disease prediction. This important information of relative performance can be used to aid researchers in the selection of an appropriate supervised machine learning algorithm for their studies.
format Online
Article
Text
id pubmed-6925840
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-69258402019-12-30 Comparing different supervised machine learning algorithms for disease prediction Uddin, Shahadat Khan, Arif Hossain, Md Ekramul Moni, Mohammad Ali BMC Med Inform Decis Mak Research Article BACKGROUND: Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently shown a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine learning algorithms, and their performance and usage for disease risk prediction. METHODS: In this study, extensive research efforts were made to identify those studies that applied more than one supervised machine learning algorithm on single disease prediction. Two databases (i.e., Scopus and PubMed) were searched for different types of search items. Thus, we selected 48 articles in total for the comparison among variants supervised machine learning algorithms for disease prediction. RESULTS: We found that the Support Vector Machine (SVM) algorithm is applied most frequently (in 29 studies) followed by the Naïve Bayes algorithm (in 23 studies). However, the Random Forest (RF) algorithm showed superior accuracy comparatively. Of the 17 studies where it was applied, RF showed the highest accuracy in 9 of them, i.e., 53%. This was followed by SVM which topped in 41% of the studies it was considered. CONCLUSION: This study provides a wide overview of the relative performance of different variants of supervised machine learning algorithms for disease prediction. This important information of relative performance can be used to aid researchers in the selection of an appropriate supervised machine learning algorithm for their studies. BioMed Central 2019-12-21 /pmc/articles/PMC6925840/ /pubmed/31864346 http://dx.doi.org/10.1186/s12911-019-1004-8 Text en © The Author(s).
2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Uddin, Shahadat
Khan, Arif
Hossain, Md Ekramul
Moni, Mohammad Ali
Comparing different supervised machine learning algorithms for disease prediction
title Comparing different supervised machine learning algorithms for disease prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6925840/
https://www.ncbi.nlm.nih.gov/pubmed/31864346
http://dx.doi.org/10.1186/s12911-019-1004-8
work_keys_str_mv AT uddinshahadat comparingdifferentsupervisedmachinelearningalgorithmsfordiseaseprediction
AT khanarif comparingdifferentsupervisedmachinelearningalgorithmsfordiseaseprediction
AT hossainmdekramul comparingdifferentsupervisedmachinelearningalgorithmsfordiseaseprediction
AT monimohammadali comparingdifferentsupervisedmachinelearningalgorithmsfordiseaseprediction