
Hands-on training about overfitting

Overfitting is one of the critical problems in developing models by machine learning. With machine learning becoming an essential technology in computational biology, we must include training about overfitting in all courses that introduce this technology to students and practitioners. We here propo...

Bibliographic Details
Main Authors: Demšar, Janez; Zupan, Blaž
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects: Education
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7932115/
https://www.ncbi.nlm.nih.gov/pubmed/33661899
http://dx.doi.org/10.1371/journal.pcbi.1008671
author Demšar, Janez
Zupan, Blaž
collection PubMed
description Overfitting is one of the critical problems in developing models by machine learning. With machine learning becoming an essential technology in computational biology, we must include training about overfitting in all courses that introduce this technology to students and practitioners. We here propose a hands-on training for overfitting that is suitable for introductory level courses and can be carried out on its own or embedded within any data science course. We use workflow-based design of machine learning pipelines, experimentation-based teaching, and hands-on approach that focuses on concepts rather than underlying mathematics. We here detail the data analysis workflows we use in training and motivate them from the viewpoint of teaching goals. Our proposed approach relies on Orange, an open-source data science toolbox that combines data visualization and machine learning, and that is tailored for education in machine learning and explorative data analysis.
format Online
Article
Text
id pubmed-7932115
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7932115 2021-03-10 Hands-on training about overfitting Demšar, Janez; Zupan, Blaž PLoS Comput Biol Education Public Library of Science 2021-03-04 /pmc/articles/PMC7932115/ /pubmed/33661899 http://dx.doi.org/10.1371/journal.pcbi.1008671 Text en © 2021 Demšar, Zupan http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Hands-on training about overfitting
topic Education