Machine learning in medicine: a practical introduction

BACKGROUND: Following visible successes on a wide range of predictive tasks, machine learning techniques are attracting substantial interest from medical researchers and clinicians. We address the need for capacity development in this area by providing a conceptual introduction to machine learning a...

Bibliographic Details
Main authors:	Sidey-Gibbons, Jenni A. M., Sidey-Gibbons, Chris J.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6425557/ https://www.ncbi.nlm.nih.gov/pubmed/30890124 http://dx.doi.org/10.1186/s12874-019-0681-4

author	Sidey-Gibbons, Jenni A. M., Sidey-Gibbons, Chris J.
collection	PubMed
description	BACKGROUND: Following visible successes on a wide range of predictive tasks, machine learning techniques are attracting substantial interest from medical researchers and clinicians. We address the need for capacity development in this area by providing a conceptual introduction to machine learning alongside a practical guide to developing and evaluating predictive algorithms using freely-available open source software and public domain data. METHODS: We demonstrate the use of machine learning techniques by developing three predictive models for cancer diagnosis using descriptions of nuclei sampled from breast masses. These algorithms include regularized General Linear Model regression (GLMs), Support Vector Machines (SVMs) with a radial basis function kernel, and single-layer Artificial Neural Networks. The publicly-available dataset describing the breast mass samples (N = 683) was randomly split into evaluation (n = 456) and validation (n = 227) samples. We trained algorithms on data from the evaluation sample before they were used to predict the diagnostic outcome in the validation dataset. We compared the predictions made on the validation datasets with the real-world diagnostic decisions to calculate the accuracy, sensitivity, and specificity of the three models. We explored the use of averaging and voting ensembles to improve predictive performance. We provide a step-by-step guide to developing algorithms using the open-source R statistical programming environment. RESULTS: The trained algorithms were able to classify cell nuclei with high accuracy (.94-.96), sensitivity (.97-.99), and specificity (.85-.94). Maximum accuracy (.96) and area under the curve (.97) were achieved using the SVM algorithm. Prediction performance increased marginally (accuracy = .97, sensitivity = .99, specificity = .95) when algorithms were arranged into a voting ensemble. CONCLUSIONS: We use a straightforward example to demonstrate the theory and practice of machine learning for clinicians and medical researchers. The principles which we demonstrate here can be readily applied to other complex tasks including natural language processing and image recognition. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12874-019-0681-4) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-6425557
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6425557 2019-03-29 Machine learning in medicine: a practical introduction Sidey-Gibbons, Jenni A. M. Sidey-Gibbons, Chris J. BMC Med Res Methodol Research Article BioMed Central 2019-03-19 /pmc/articles/PMC6425557/ /pubmed/30890124 http://dx.doi.org/10.1186/s12874-019-0681-4 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Machine learning in medicine: a practical introduction
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6425557/ https://www.ncbi.nlm.nih.gov/pubmed/30890124 http://dx.doi.org/10.1186/s12874-019-0681-4
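Illustration (not part of the record): the evaluation summarized in the description field, namely accuracy, sensitivity, and specificity for three models plus a voting ensemble, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not the authors' code; the article itself works in R. The labels and predictions below are made-up toy values rather than data or results from the article, the function names are hypothetical, and treating "malignant" as the positive class is an assumption.

```python
from collections import Counter

POSITIVE = "malignant"  # assumption: malignant is the positive class
NEGATIVE = "benign"


def metrics(y_true, y_pred, positive=POSITIVE):
    """Return accuracy, sensitivity and specificity for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # true positive
            else:
                fp += 1  # false positive
        else:
            if truth == positive:
                fn += 1  # false negative
            else:
                tn += 1  # true negative
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives detected
    }


def majority_vote(*model_predictions):
    """Combine per-sample predictions from several models by majority vote.

    With an odd number of models and two classes there are no ties.
    """
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*model_predictions)
    ]


# Toy validation labels and toy predictions from three hypothetical models.
M, B = POSITIVE, NEGATIVE
y_true = [M, M, M, M, B, B, B, B]
glm = [M, M, M, B, B, B, B, M]
svm = [M, M, B, M, B, B, B, B]
ann = [M, B, M, M, B, B, M, B]

ensemble = majority_vote(glm, svm, ann)

for name, preds in [("glm", glm), ("svm", svm), ("ann", ann), ("vote", ensemble)]:
    print(name, metrics(y_true, preds))
```

In this toy case each model errs on different samples, so the majority vote corrects every individual mistake; that is the intuition behind a voting ensemble, though real data will rarely behave this cleanly.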
